Parametric estimation of hidden stochastic model by contrast
minimization and deconvolution: application to the Stochastic Volatility Model



Salima El Kolei 



Abstract 



We study a new parametric approach for particular hidden stochastic models such as the Stochastic 
Volatility model. This method is based on contrast minimization and deconvolution. 

After proving consistency and asymptotic normality of the estimator, which lead to asymptotic confidence intervals, we provide a thorough numerical study comparing it with most of the classical methods used in practice (the Quasi-Maximum Likelihood estimator, the Simulated Expectation-Maximization Likelihood estimator and Bayesian estimators). We show that our estimator clearly outperforms the Maximum Likelihood Estimator in terms of computing time, as well as most of the other methods. We also show that this contrast method is the most robust with respect to non-Gaussianity of the error and does not need any tuning parameter.

Keywords: contrast function, deconvolution, parametric inference, stochastic volatility. 
1 Introduction 

This paper is concerned with the particular hidden stochastic model:

$$
\begin{cases}
Z_i = X_i + \varepsilon_i, \\
X_{i+1} = b_{\phi_0}(X_i) + \eta_{i+1},
\end{cases}
\tag{1}
$$

where $(\varepsilon_i)_{i \ge 1}$ and $(\eta_i)_{i \ge 1}$ are two independent sequences of independent and identically distributed (i.i.d.) centered random variables with variances $\sigma_\varepsilon^2$ and $\sigma_0^2$. It is assumed that the variance $\sigma_\varepsilon^2$ is known. The terminology hidden comes from the unobservable character of the process $(X_i)_{i \ge 1}$, since the only available observations are $Z_1, \dots, Z_n$.

The dynamics of the process $X_i$ is described by a measurable function $b_{\phi_0}$, which depends on an unknown parameter $\phi_0$, and by a sequence of i.i.d. centered random variables with unknown variance $\sigma_0^2$. We denote by $\theta_0 = (\phi_0, \sigma_0^2)$ the vector of parameters governing the process $X_i$ and suppose that the model is correctly specified: that is, $\theta_0$ belongs to the parameter space $\Theta \subset \mathbb{R}^r$, with $r \in \mathbb{N}^*$. Estimating such a model is a real challenge and has been studied by many authors. In particular, the field of applications is very large: radar models, genomic models, financial models... (see [CMR05] for a partial review).

When $b$ is linear in $\phi_0$ and $X_i$, the model (1) is an autoregressive model with measurement error. K.C. Chanda provided in [Cha95b, Cha95a] an asymptotically normal estimator for the vector of parameters by using modified Yule-Walker equations. When $b$ is nonlinear, an efficient estimator of $\theta_0$ is given by the Maximum Likelihood Estimator (MLE). Recently, in [DRM+11], the authors showed that such an estimator is consistent and asymptotically normal. In practice, the main difficulty arises from the unobservable character of the process $X_i$. The likelihood is then intractable: it is only available in the form of a multiple integral. Exact likelihood methods require simulations and therefore have an intensive computational cost. In many cases, the MLE has to be approximated.
A popular approach to approximate the MLE consists in using Markov Chain Monte Carlo (MCMC) simulation techniques and importance sampling (IS). Thanks to the development of these methods, maximum likelihood estimation has made considerable progress and Bayesian estimation has received more attention (see [SR93]). Another method for computing the MLE consists in using the Expectation-Maximization (EM) algorithm proposed by Dempster et al. in 1977 (see [DLR77]). Nevertheless, since $X_i$ is unobservable, this method requires introducing an MCMC or IS method in the Expectation step. Although these methods are used in practice, they can be very expensive from a computational point of view. Some authors have proposed Sequential Monte Carlo (SMC) algorithms, which reduce the computational cost through a recursive construction. We refer to the book of Doucet et al. [DFG01] for a complete review of these methods.

We propose here an approach based on M-estimation: it consists in the optimisation of a well-chosen contrast function (see [Vaa00], p. 41, for a partial review) combined with a deconvolution strategy. The deconvolution problem is encountered in many statistical situations where the observations are collected with random errors. This approach is much less demanding from a computational point of view. In the regression framework, where $X_i$ is observed through $Z_i = X_i + \varepsilon_i$ and the response is $b_{\phi_0}(X_i)$ plus a noise term, a method for estimating the parameter $\phi_0$ has been proposed by F. Comte and M. Taupin [CT01]. Their estimation procedure is based on a modified least squares minimization. In the same perspective, J. Dedecker, A. Samson and M.-L. Taupin [DST11] propose a procedure based on weighted least squares estimation: their assumptions on the process $X_i$ are less restrictive than those of F. Comte and M. Taupin, and they provide a consistent estimation of the parameter $\phi_0$ with a parametric rate of convergence in a very general framework. Their general estimator is based on the introduction of a deconvolution kernel density and depends on the choice of a weight function. The approach proposed here is different: it is not based on weighted least squares estimation, so the choice of a weight function does not arise in this paper. Moreover, it allows one to estimate both parameters $\phi_0$ and $\sigma_0^2$.

Our principle of estimation relies on the Nadaraya-Watson strategy and was proposed by F. Comte et al. in [FL11] in a nonparametric setting to estimate the function $b$ as a ratio of an estimate of $l = b f$ and an estimate of $f$, where $f$ represents the invariant density of the $X_i$. We propose to adapt their approach to the parametric context and suppose that the form of the stationary density $f_\theta$ is known up to some unknown parameter $\theta$. The first purpose of the paper is to present our estimator and its statistical properties: under weak assumptions, we show that it is a consistent and asymptotically normal estimator. Our work is purely parametric, but we go further in this direction by proposing an analytical expression of the asymptotic variance matrix $\Sigma(\hat\theta_n)$, which allows us to construct confidence intervals at a prescribed level (see Corollary 1.2).



The models: In this paper, we focus on financial applications. In the popular Black & Scholes model (1973) (see [BS73]), the returns of an asset price are modelled as a geometric Brownian motion with constant volatility. Since then, empirical works have shown that this assumption is not satisfied. In particular, it appears that the volatility of an asset return also has to be considered as a stochastic process. Many authors have therefore proposed Stochastic Volatility models, which are widely used in finance and econometrics. The real challenge is that the practitioner does not observe any realisation of the volatility process: only the asset prices are observed, so that Stochastic Volatility models belong to the class of hidden Markov models. Although the scope of our method is general, we have chosen to focus on the so-called SV model in this paper. We also investigate the behavior of the method on the simpler autoregressive process AR(1) with measurement noise, which has been widely studied and on which our method can be more easily understood and compared with other ones. Our procedure allows us to estimate the parameters of a large class of discrete Stochastic Volatility models (SV, ARCH-E model, Vasicek (1977), Merton (1973)...), which is a real challenge in financial applications.

(i) Gaussian autoregressive AR(1) with measurement noise: It has the following form:

$$
\begin{cases}
Z_{i+1} = X_{i+1} + \varepsilon_{i+1}, \\
X_{i+1} = \phi_0 X_i + \eta_{i+1},
\end{cases}
$$

where $\varepsilon_{i+1}$ and $\eta_{i+1}$ are two centered Gaussian random variables with variances $\sigma_\varepsilon^2$, assumed to be known, and $\sigma_0^2$, assumed to be unknown. Additionally, we assume that $|\phi_0| < 1$, which implies the stationarity and ergodicity of the process $X_i$ [Dou94].



(ii) SV model: The discrete-time Stochastic Volatility model (SV in short) was introduced by Taylor in 1982 (see [Tay82]); it is directly connected to the type of diffusion process used in asset-pricing theory (see [MT90]):

$$
\begin{cases}
Y_{i+1} = \exp\left(\dfrac{X_{i+1}}{2}\right) \xi_{i+1}, \\
X_{i+1} = \phi_0 X_i + \eta_{i+1},
\end{cases}
$$

where $\xi_{i+1}$ and $\eta_{i+1}$ are two centered Gaussian random variables, the first with variance assumed to be known and the second with variance $\sigma_0^2$ assumed to be unknown.

By applying the log-transformation $Z_{i+1} = \log(Y_{i+1}^2) - \mathbb{E}[\log(\xi_{i+1}^2)]$ and $\varepsilon_{i+1} = \log(\xi_{i+1}^2) - \mathbb{E}[\log(\xi_{i+1}^2)]$, the SV model is a particular version of (1). We assume that $|\phi_0| < 1$ and we refer the reader to [CJL00] for the mixing properties of stochastic volatility models.
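As a worked check of this transformation, the following short derivation is a sketch under two assumptions: the return equation has the usual form $Y_{i+1} = \exp(X_{i+1}/2)\,\xi_{i+1}$, and $\xi_{i+1}$ is standard Gaussian, so that $\xi_{i+1}^2$ is chi-square with one degree of freedom. It shows how the log-transform produces an additive-noise model of the form (1); the constants follow from standard properties of the digamma function $\psi$, with $\gamma_E$ denoting the Euler-Mascheroni constant.

```latex
\begin{align*}
\log(Y_{i+1}^2) &= X_{i+1} + \log(\xi_{i+1}^2), \\
Z_{i+1} &= \log(Y_{i+1}^2) - \mathbb{E}[\log(\xi_{i+1}^2)] = X_{i+1} + \varepsilon_{i+1}, \\
\mathbb{E}[\log(\xi_{i+1}^2)] &= \psi(1/2) + \log 2 = -(\gamma_E + \log 2) \approx -1.27, \\
\operatorname{Var}[\log(\xi_{i+1}^2)] &= \psi'(1/2) = \frac{\pi^2}{2} \approx 4.93 .
\end{align*}
```

The resulting noise $\varepsilon_{i+1}$ is centered with known variance but is not Gaussian: it follows a centered log-chi-square law.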



Most of the computational problems stem from the assumption that the innovations of the returns are Gaussian, which translates into a logarithmic chi-square distribution when the model is transformed into a linear state space model. Many authors have ignored this in their implementation, and many others have used a mixture of Gaussians to approximate the log-chi-square density. For example, in the Quasi-Maximum Likelihood (QML) method implemented by Jacquier, Polson and Rossi in [JPR94] and in the Simulated Expectation-Maximization Likelihood estimator (SIEMLE) proposed by Kim, Shephard and Chib in [KSC98], a mixture of Gaussian distributions is used to approximate the log-chi-square distribution. Harvey et al. [HRS94] used the Kalman filter to estimate the likelihood of the transformed state space model, hence the model was also assumed to be Gaussian.



Consequently, the second purpose of this paper consists in comparing our contrast estimator with different estimators: the QML, the SIEMLE and Bayesian estimators. From a computational point of view, we show that, in addition to being straightforward to implement, the contrast estimator is faster than the SIEMLE (see Table 1 in Section 2.3.1). Besides, we show that our estimator outperforms the QML and Bayesian estimators.



Organization of the paper: The remainder of the paper is organized as follows: the estimation procedure and its statistical properties are presented in Section 1.1. Section 2 contains the numerical study: in Section 2.2 we give the parameter estimates and the comparison with other estimators on simulated data, and Section 2.4 contains the study on real data. We compare our contrast estimator with other ones on the S&P500 and FTSE indices. Finally, all results are proved in Appendices B and C.

Notations: We denote by $u^*(t) = \int e^{itx} u(x)\,dx$ the Fourier transform of the function $u(x)$ and $\langle u, v \rangle = \int u(x)\overline{v}(x)\,dx$ with $v\overline{v} = |v|^2$. We write $\|u\|_2 = \left(\int |u(x)|^2\,dx\right)^{1/2}$ for the norm of $u(x)$ on the space of functions $L^2(\mathbb{R})$. By property of the Fourier transform, one has $(u^*)^*(x) = 2\pi u(-x)$ and $\langle u_1, u_2 \rangle = \frac{1}{2\pi}\langle u_1^*, u_2^* \rangle$. The vector of the partial derivatives of $f$ with respect to (w.r.t.) $\theta$ is denoted by $\nabla_\theta f$ and the Hessian matrix of $f$ w.r.t. $\theta$ is denoted by $\nabla_\theta^2 f$. The Euclidean norm of a matrix $A$, that is, the square root of the sum of the squares of all its elements, will be written $\|A\|$. We denote by $\mathbf{Z}_i$ the pair $(Z_i, Z_{i+1})$, and $\mathbf{z}_i = (z_i, z_{i+1})$ is a given realisation of $\mathbf{Z}_i$.

In the following, $\mathbb{P}$, $\mathbb{E}$, Var and Cov denote respectively the probability $\mathbb{P}_{\theta_0}$, the expected value $\mathbb{E}_{\theta_0}$, the variance $\mathrm{Var}_{\theta_0}$ and the covariance $\mathrm{Cov}_{\theta_0}$ when the true parameter is $\theta_0$. Additionally, we write $\mathbf{P}_n$ (resp. $\mathbf{P}$) for the empirical (resp. theoretical) expectation, that is: for any stochastic variable $X$: $\mathbf{P}_n(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$ (resp. $\mathbf{P}(X) = \mathbb{E}[X]$).






1.1 Procedure: Contrast estimator 

Hereafter, we propose an explicit estimator of the parameter $\theta_0$. This estimator, called the contrast estimator, is based on the minimization of suitable functions of the observations, usually called "contrast functions". We refer the reader to [Vaa00] for a general account of this notion. In this part, we use the contrast function proposed by [CLR10], that is:

$$
\mathbf{P}_n m_\theta = \frac{1}{n}\sum_{i=1}^{n} m_\theta(\mathbf{Z}_i), \tag{4}
$$

with $n$ the number of observations and:

$$
m_\theta(\mathbf{z}_i) : (\theta, \mathbf{z}_i) \in (\Theta \times \mathbb{R}^2) \mapsto m_\theta(\mathbf{z}_i) = \|l_\theta\|_2^2 - 2 z_{i+1} u^*_{l_\theta}(z_i),
$$

where the functions $l_\theta$ and $u_v$ are given by:

$$
l_\theta(x) = b_\phi(x) f_\theta(x) \quad \text{and} \quad u_v(x) = \frac{1}{2\pi}\frac{v^*(-x)}{f_\varepsilon^*(x)}, \tag{5}
$$

with $f_\theta$ the invariant density of $X_i$.

Some assumptions. As our procedure relies on the Fourier deconvolution strategy, in order to construct our estimator, we assume that the density of the noise $\varepsilon_i$, denoted by $f_\varepsilon$, is fully known and belongs to $L^2(\mathbb{R})$, and that $f_\varepsilon^*(x) \neq 0$ for all $x \in \mathbb{R}$. Furthermore, we assume that the function $l_\theta$ belongs to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, is twice continuously differentiable w.r.t. $\theta \in \Theta$ for any $x$ and is measurable w.r.t. $x$ for all $\theta$ in $\Theta$. Additionally, each coordinate of $\nabla_\theta l_\theta$ and each coordinate of $\nabla_\theta^2 l_\theta$ belongs to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. The function $u_{l_\theta}$ must be integrable, and each coordinate of $u_{\nabla_\theta l_\theta}$ and each coordinate of $u_{\nabla_\theta^2 l_\theta}$ has to be integrable as well.

For the statistical study, the key assumption is that the process $(X_i)_{i \ge 1}$ is stationary and ergodic (see [CJL00] for a definition).

Let us explain the choice of the contrast function and how the deconvolution strategy works. Obviously, as the model (1) shows, the $\mathbf{Z}_i$ are not i.i.d. However, by assumption, they are stationary ergodic, so the convergence of $\mathbf{P}_n m_\theta$ to $\mathbf{P} m_\theta = \mathbb{E}[m_\theta(\mathbf{Z}_1)]$ as $n$ tends to infinity is provided by the Ergodic Theorem. Moreover, the limit $\mathbb{E}[m_\theta(\mathbf{Z}_1)]$ of the contrast function can be explicitly computed:



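$$
\mathbb{E}[m_\theta(\mathbf{Z}_1)] = \|l_\theta\|_2^2 - 2\,\mathbb{E}\left[Z_2 u^*_{l_\theta}(Z_1)\right].
$$

By Eq. (1) and under the independence assumptions on the noises $(\varepsilon_2)$ and $(\eta_2)$, we have:

$$
\mathbb{E}\left[Z_2 u^*_{l_\theta}(Z_1)\right] = \mathbb{E}\left[b_{\phi_0}(X_1) u^*_{l_\theta}(Z_1)\right].
$$

Using Fubini's Theorem and Eq. (1), we obtain:

$$
\begin{aligned}
\mathbb{E}\left[b_{\phi_0}(X_1) u^*_{l_\theta}(Z_1)\right]
&= \mathbb{E}\left[b_{\phi_0}(X_1) \int e^{iZ_1 y} u_{l_\theta}(y)\,dy\right] \\
&= \mathbb{E}\left[b_{\phi_0}(X_1) \int \frac{1}{2\pi}\frac{1}{f_\varepsilon^*(y)} e^{iZ_1 y}\, l_\theta^*(-y)\,dy\right] \\
&= \frac{1}{2\pi}\int \mathbb{E}\left[b_{\phi_0}(X_1) e^{iX_1 y}\right] l_\theta^*(-y)\,dy \\
&= \frac{1}{2\pi}\mathbb{E}\left[b_{\phi_0}(X_1)\,(l_\theta^*)^*(-X_1)\right] \\
&= \mathbb{E}\left[b_{\phi_0}(X_1)\, l_\theta(X_1)\right] \\
&= \int b_{\phi_0}(x) f_{\theta_0}(x)\, b_\phi(x) f_\theta(x)\,dx = \langle l_\theta, l_{\theta_0} \rangle .
\end{aligned}
$$

Then,

$$
\mathbb{E}[m_\theta(\mathbf{Z}_1)] = \|l_\theta\|_2^2 - 2\langle l_\theta, l_{\theta_0} \rangle \tag{6}
$$

$$
\phantom{\mathbb{E}[m_\theta(\mathbf{Z}_1)]} = \|l_\theta - l_{\theta_0}\|_2^2 - \|l_{\theta_0}\|_2^2 . \tag{7}
$$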



The computation above is the key of the deconvolution procedure. Note that, in a smooth framework (uniqueness assumption), this quantity is minimal when $\theta = \theta_0$.

Hence, the associated minimum-contrast estimator $\hat\theta_n$ is defined as any solution of:

$$
\hat\theta_n = \arg\min_{\theta \in \Theta} \mathbf{P}_n m_\theta . \tag{8}
$$



Remark. We refer the reader to [Dou94] for the proof that if $X_i$ is an ergodic process then the process $Z_i$, which is the sum of an ergodic process and an i.i.d. noise, is again stationary ergodic. Furthermore, by the definition of an ergodic process, if $Z_i$ is an ergodic process then the couple $\mathbf{Z}_i = (Z_i, Z_{i+1})$ inherits the property (see [CJL00]).



1.2 Asymptotic properties of the Contrast estimator 

Our proof holds under the following assumptions. For the reader's convenience, we denote by (E) (resp. (C) and (T)) the assumptions which serve for the existence (resp. the consistency and the Central Limit Theorem). If the same assumption is needed for two results, for example for the existence and the consistency, it is denoted by (EC).



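(ECT): The parameter space $\Theta$ is a compact subset of $\mathbb{R}^r$ and $\theta_0$ is an element of the interior of $\Theta$.

(C): (Local dominance): $\mathbb{E}\left[\sup_{\theta \in \Theta} \left|Z_2 u^*_{l_\theta}(Z_1)\right|\right] < \infty$.

(CT): The application $\theta \mapsto \mathbf{P} m_\theta$ admits a unique minimum and its Hessian matrix, denoted by $V_\theta$, is non-singular at $\theta_0$.

(T): (Moment condition): For some $\delta > 0$ and for $j \in \{1, \dots, r\}$:

$$
\mathbb{E}\left[\left|Z_2\, u^*_{\frac{\partial l_\theta}{\partial \theta_j}}(Z_1)\right|^{2+\delta}\right] < \infty .
$$

(Hessian local dominance): For some neighbourhood $\mathcal{U}$ of $\theta_0$ and for $j, k \in \{1, \dots, r\}$:

$$
\mathbb{E}\left[\sup_{\theta \in \mathcal{U}} \left|Z_2\, u^*_{\frac{\partial^2 l_\theta}{\partial \theta_j \partial \theta_k}}(Z_1)\right|\right] < \infty .
$$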



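Let us introduce the matrix:

$$
\Sigma(\theta) = V_\theta^{-1}\,\Omega(\theta)\,{V_\theta^{-1}}' \quad \text{with} \quad \Omega(\theta) = \Omega_1(\theta) + 2\sum_{j=2}^{+\infty} \Omega_j(\theta),
$$

where $\Omega_1(\theta) = \mathrm{Var}\left(\nabla_\theta m_\theta(\mathbf{Z}_1)\right)$ and $\Omega_j(\theta) = \mathrm{Cov}\left(\nabla_\theta m_\theta(\mathbf{Z}_1), \nabla_\theta m_\theta(\mathbf{Z}_j)\right)$.

Theorem 1.1. Under our assumptions, let $\hat\theta_n$ be the minimum-contrast estimator defined by (8). Then:

$$
\hat\theta_n \longrightarrow \theta_0 \quad \text{in probability as } n \to \infty .
$$

Moreover, if $\mathbf{Z}_i$ is geometrically ergodic (see Definition A.1 in Appendix A), then:

$$
\sqrt{n}\,(\hat\theta_n - \theta_0) \longrightarrow \mathcal{N}\left(0, \Sigma(\theta_0)\right) \quad \text{in law as } n \to \infty .
$$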

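The following corollary gives an expression of the matrices $\Omega(\theta_0)$ and $V_{\theta_0}$ of Theorem 1.1 for the practical implementation:

Corollary 1.2. Under our assumptions, the matrix $\Omega(\theta_0)$ is given by:

$$
\Omega(\theta_0) = \Omega_1(\theta_0) + 2\sum_{j=2}^{+\infty} \Omega_j(\theta_0),
$$

where:

$$
\Omega_1(\theta_0) = 4\,\mathbb{E}\left[Z_2^2 \left(u^*_{\nabla_\theta l_\theta}(Z_1)\right)\left(u^*_{\nabla_\theta l_\theta}(Z_1)\right)'\right] - 4\,\mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right] \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right]' .
$$

And, the covariance terms are given by:

$$
\Omega_j(\theta_0) = 4\left[C_j - \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right] \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right]'\right],
$$

where $C_j = \mathbb{E}\left[\left(b_{\phi_0}(X_1)\nabla_\theta l_\theta(X_1)\right)\left(b_{\phi_0}(X_j)\nabla_\theta l_\theta(X_j)\right)'\right]$, all gradients being evaluated at $\theta = \theta_0$.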

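Furthermore, the Hessian matrix $V_{\theta_0}$ is given by:

$$
\left[V_{\theta_0}\right]_{j,k} = 2\left\langle \frac{\partial l_\theta}{\partial \theta_j}, \frac{\partial l_\theta}{\partial \theta_k} \right\rangle \Bigg|_{\theta = \theta_0}, \qquad 1 \le j, k \le r .
$$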



Let us now state the strategy of the proof; the full proof is given in Appendix B. The proof of Theorem 1.1 relies on the properties of M-estimators and on the deconvolution strategy. The existence of our estimator follows from the regularity properties of the function $l_\theta$ and a compactness argument on the parameter space; it is explained in Appendix B.1. The key of the proof consists in proving the asymptotic properties of our estimator. This is done by splitting the proof into two parts: we first give the consistency result in Appendix B.2 and then the asymptotic normality in Appendix B.3. Let us introduce the principal arguments:

The main idea for proving the consistency of an M-estimator comes from the following observation: if $\mathbf{P}_n m_\theta$ converges to $\mathbf{P} m_\theta$ in probability, and if the true parameter solves the limit minimization problem, then the limit of the argminimum $\hat\theta_n$ is $\theta_0$. By using an argument of uniform convergence in probability and the compactness of the parameter space, we show that the argminimum of the limit is the limit of the argminimum. A standard method to prove the uniform convergence is to use the Uniform Law of Large Numbers (see Lemma A.2 in Appendix A). Combining these arguments with the dominance argument (C) gives the consistency of our estimator, and hence the first part of Theorem 1.1.

The asymptotic normality follows essentially from the Central Limit Theorem for a mixing process (see [Jon04]). Thanks to the consistency, the proof is based on a moment condition on the Jacobian vector of the function $m_\theta(\mathbf{z})$ and on a local dominance condition on its Hessian matrix. By analogy with likelihood results, one can see these assumptions as a moment condition on the score function and a local dominance condition on the Hessian.



2 Applications 

2.0.1 Contrast estimator for the Gaussian AR(1) model with measurement noise: 

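Consider the following autoregressive process AR(1) with measurement noise:

$$
\begin{cases}
Z_i = X_i + \varepsilon_i, \\
X_{i+1} = \phi_0 X_i + \eta_{i+1}.
\end{cases}
$$

The noises $\varepsilon_i$ and $\eta_i$ are supposed to be centered Gaussian random variables with variances $\sigma_\varepsilon^2$ and $\sigma_0^2$, respectively. We assume that $\sigma_\varepsilon^2$ is known. Here, the unknown vector of parameters is $\theta_0 = (\phi_0, \sigma_0^2)$ and, for the stationarity and ergodicity of the process $X_i$, we assume that $|\phi_0| < 1$ [Dou94]. The functions $b_\phi$ and $l_\theta$ are defined by:

$$
b_\phi(x) : (x, \theta) \in (\mathbb{R} \times \Theta) \mapsto b_\phi(x) = \phi x,
$$

$$
l_\theta(x) : (x, \theta) \in (\mathbb{R} \times \Theta) \mapsto l_\theta(x) = b_\phi(x) f_\theta(x) = \frac{\phi}{\sqrt{2\pi\gamma^2}}\, x \exp\left(-\frac{x^2}{2\gamma^2}\right),
$$

where $\gamma^2 = \frac{\sigma^2}{1 - \phi^2}$. The vector of parameters $\theta$ belongs to the compact subset $\Theta = [-1 + r;\, 1 - r] \times [\sigma^2_{\min};\, \sigma^2_{\max}]$, with $\sigma^2_{\min} > \sigma_\varepsilon^2 + \bar r$, where $r$, $\bar r$, $\sigma^2_{\min}$ and $\sigma^2_{\max}$ are positive real constants. We consider this subset since, by stationarity of $X_i$, we have $|\phi| < 1$ and, by construction, the function $u^*_{l_\theta}$ is well defined for $\sigma^2 > \sigma_\varepsilon^2(1 - \phi^2)$ with $\phi \in [-1 + r;\, 1 - r]$, which is guaranteed as soon as $\sigma^2 > \sigma_\varepsilon^2$. The contrast estimator defined in Section 1.1 has the following form:

$$
\hat\theta_n = \arg\min_{\theta \in \Theta} \left\{ \frac{\phi^2 \gamma}{4\sqrt{\pi}} - \frac{2}{n}\sum_{i=1}^{n} Z_{i+1}\, \frac{\phi \gamma^2 Z_i}{\sqrt{2\pi}\,(\gamma^2 - \sigma_\varepsilon^2)^{3/2}} \exp\left(-\frac{Z_i^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right) \right\},
$$

with $n$ the number of observations. Theorem 1.1 applies here and the corresponding result for the Gaussian AR(1) model is given in Appendix C.1. As we already mentioned, Corollary 1.2 allows us to compute confidence intervals: for all $i = 1, 2$:

$$
\mathbb{P}\left( \hat\theta_{n,i} - z_{1-\alpha/2}\sqrt{\frac{e_i' \Sigma(\hat\theta_n) e_i}{n}} \le \theta_{0,i} \le \hat\theta_{n,i} + z_{1-\alpha/2}\sqrt{\frac{e_i' \Sigma(\hat\theta_n) e_i}{n}} \right) \longrightarrow 1 - \alpha \quad \text{as } n \to \infty,
$$

where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the Gaussian law, $\theta_{0,i}$ is the $i$-th coordinate of $\theta_0$ and $e_i$ is the $i$-th vector of the canonical basis of $\mathbb{R}^2$. The covariance matrix $\Sigma(\hat\theta_n)$ is computed in Lemma C.1 in Appendix C.1.3.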
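To make the Gaussian AR(1) contrast concrete, here is a minimal Python sketch. It relies on closed forms obtained by applying definition (5) to the Gaussian densities, namely $\|l_\theta\|_2^2 = \phi^2\gamma/(4\sqrt{\pi})$ and $u^*_{l_\theta}(z) = \phi\gamma^2 z \exp(-z^2/(2s))/(\sqrt{2\pi}\,s^{3/2})$ with $s = \gamma^2 - \sigma_\varepsilon^2$. The function names, the grid search (used in place of a numerical optimizer) and the averaging over the $n-1$ available pairs are illustrative assumptions, not the implementation used for the reported experiments.

```python
import numpy as np

def simulate_ar1_noisy(n, phi, sigma2, sigma_eps2, rng):
    """Simulate Z_1..Z_n: stationary Gaussian AR(1) plus Gaussian noise."""
    x = np.empty(n)
    x[0] = rng.normal(0.0, np.sqrt(sigma2 / (1.0 - phi**2)))
    eta = rng.normal(0.0, np.sqrt(sigma2), n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + eta[i]
    return x + rng.normal(0.0, np.sqrt(sigma_eps2), n)

def l_norm_sq(phi, sigma2):
    """Closed form of ||l_theta||_2^2 for l_theta(x) = phi * x * f_theta(x)."""
    gamma = np.sqrt(sigma2 / (1.0 - phi**2))
    return phi**2 * gamma / (4.0 * np.sqrt(np.pi))

def u_star(z, phi, sigma2, sigma_eps2):
    """Closed form of u*_{l_theta}(z); requires gamma^2 > sigma_eps^2."""
    gamma2 = sigma2 / (1.0 - phi**2)
    s = gamma2 - sigma_eps2
    return phi * gamma2 * z * np.exp(-z**2 / (2.0 * s)) / (np.sqrt(2.0 * np.pi) * s**1.5)

def contrast(z, phi, sigma2, sigma_eps2):
    """Empirical contrast, averaged over the pairs (Z_i, Z_{i+1})."""
    cross = np.mean(z[1:] * u_star(z[:-1], phi, sigma2, sigma_eps2))
    return l_norm_sq(phi, sigma2) - 2.0 * cross

def contrast_estimate(z, sigma_eps2, phis, sigma2s):
    """Grid-search minimizer (an assumption; any optimizer could be used)."""
    best = (np.inf, None, None)
    for phi in phis:
        for s2 in sigma2s:
            c = contrast(z, phi, s2, sigma_eps2)
            if c < best[0]:
                best = (c, phi, s2)
    return best[1], best[2]
```

A possible usage is `contrast_estimate(z, 0.1, np.linspace(-0.9, 0.9, 19), np.linspace(0.15, 0.6, 10))`, where the lower bound of the variance grid is kept above $\sigma_\varepsilon^2$ so that $u^*_{l_\theta}$ is well defined over the whole grid.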
2.0.2 Contrast estimator for the SV model:

We consider the following SV model:

$$
\begin{cases}
Y_{i+1} = \exp\left(\dfrac{X_{i+1}}{2}\right) \xi_{i+1}^{\beta}, \\
X_{i+1} = \phi_0 X_i + \eta_{i+1},
\end{cases}
\tag{11}
$$

with $\beta$ a positive constant. The noises $\xi_{i+1}$ and $\eta_{i+1}$ are two centered Gaussian random variables, the first with standard (unit) variance and the second with variance $\sigma_0^2$. We assume that $|\phi_0| < 1$ and we refer the reader to [CJL00] for the mixing properties of this model.

In the original version of the SV model introduced by Taylor in [Tay82], the constant $\beta$ is equal to 1. In this case, by applying the log-transformation $Z_{i+1} = \log(Y_{i+1}^2) - \mathbb{E}[\log(\xi_{i+1}^2)]$ and $\varepsilon_{i+1} = \log(\xi_{i+1}^2) - \mathbb{E}[\log(\xi_{i+1}^2)]$, the log-transformed SV model is given by:

$$
\begin{cases}
Z_{i+1} = X_{i+1} + \varepsilon_{i+1}, \\
X_{i+1} = \phi_0 X_i + \eta_{i+1}.
\end{cases}
\tag{12}
$$

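The Fourier transform of the noise $\varepsilon_{i+1}$ is given by:

$$
f_\varepsilon^*(y) = \frac{1}{\sqrt{\pi}}\, 2^{iy}\, \Gamma\left(\frac{1}{2} + iy\right) e^{-i\mathcal{E}y},
$$

where $\mathcal{E} = \mathbb{E}[\log(\xi_{i+1}^2)]$ and $\mathrm{Var}[\log(\xi_{i+1}^2)] = \sigma_\varepsilon^2 = \frac{\pi^2}{2}$. Here, $\Gamma$ represents the gamma function given by:

$$
\Gamma : u \mapsto \int_0^{+\infty} t^{u-1} e^{-t}\,dt \quad \forall u \in \mathbb{C} \text{ such that } \mathrm{Re}(u) > 0 .
$$

The vector of parameters $\theta = (\phi, \sigma^2)$ belongs to the compact subset $\Theta = [-1 + r;\, 1 - r] \times [\sigma^2_{\min};\, \sigma^2_{\max}]$, with $r$, $\sigma^2_{\min}$ and $\sigma^2_{\max}$ positive real constants.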
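The Fourier transform of the log-chi-square noise can be checked numerically. The sketch below assumes $\xi$ is standard Gaussian and uses the reflection formula $\Gamma(1/2 + iy)\Gamma(1/2 - iy) = \pi/\cosh(\pi y)$, which gives $|f_\varepsilon^*(y)| = 1/\sqrt{\cosh(\pi y)}$; it compares this modulus with a Monte Carlo estimate of $\mathbb{E}[e^{iy\varepsilon}]$. The function names are illustrative.

```python
import numpy as np

# E[log(xi^2)] for a standard Gaussian xi: -(Euler constant + log 2), about -1.27.
E_LOG_CHI2 = -(np.euler_gamma + np.log(2.0))

def empirical_cf(y, n_draws, rng):
    """Monte Carlo estimate of E[exp(i*y*eps)], eps = log(xi^2) - E[log(xi^2)]."""
    xi = rng.standard_normal(n_draws)
    eps = np.log(xi**2) - E_LOG_CHI2
    return np.mean(np.exp(1j * y * eps))

def cf_modulus(y):
    """Exact modulus |f_eps^*(y)| implied by the Gamma reflection formula."""
    return 1.0 / np.sqrt(np.cosh(np.pi * y))
```

Since the modulus only decays like $e^{-\pi|y|/2}$ while $l_\theta^*$ has a Gaussian decay, the ratio defining $u_{l_\theta}$ stays integrable for every admissible $\theta$; this is consistent with the absence of a lower bound linking $\sigma^2_{\min}$ to $\sigma_\varepsilon^2$ in the parameter set above.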



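Our contrast estimator (Section 1.1) is given by:

$$
\hat\theta_n = \arg\min_{\theta \in \Theta} \left\{ \frac{\phi^2 \gamma}{4\sqrt{\pi}} - \frac{2}{n}\sum_{i=1}^{n} Z_{i+1}\, u^*_{l_\theta}(Z_i) \right\}, \tag{13}
$$

with

$$
u_{l_\theta}(y) = \frac{1}{2\sqrt{\pi}} \left( \frac{-i\phi y \gamma^2 \exp\left(-\frac{y^2\gamma^2}{2}\right)}{\exp(-i\mathcal{E}y)\, 2^{iy}\, \Gamma\left(\frac{1}{2} + iy\right)} \right).
$$

Theorem 1.1 applies and by Slutsky's Lemma we also obtain confidence intervals. We refer the reader to Appendix C.2 for the proof.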



2.1 Comparison with the other methods
2.1.1 QML Estimator

For the SV model, the QML estimator, proposed independently by Nelson (1988) (see [Nel94]) and Harvey et al. (1994) (see [HRS94]), is based on the log-transformed model given in (12). Treating the $\varepsilon_i$ as if they were Gaussian in the log-transform of the SV model, the Kalman filter [Kal90] can be applied in order to obtain the quasi-likelihood function of $Z_{1:n} = (Z_1, \dots, Z_n)$, where $n$ is the sample size. For the AR(1) and the log-transform of the SV model, the log-likelihood $l(\theta)$ is given by:

$$
l(\theta) = \log f_\theta(Z_{1:n}) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n} \log F_i - \frac{1}{2}\sum_{i=1}^{n} \frac{\nu_i^2}{F_i},
$$

where $\nu_i$ is the one-step-ahead prediction error for $Z_i$, and $F_i$ is the corresponding mean square error. More precisely, the two quantities are given by:

$$
\nu_i = Z_i - \hat Z_i^- \quad \text{and} \quad F_i = \mathrm{Var}_\theta[\nu_i] = P_i^- + \sigma_\varepsilon^2,
$$

where $\hat Z_i^- = \mathbb{E}_\theta[Z_i \mid Z_{1:i-1}]$ is the one-step-ahead prediction for $Z_i$ and $P_i^- = \mathrm{Var}_\theta[X_i - \hat X_i^-]$ is the one-step-ahead error variance for $X_i$.

Hence, the associated estimator of $\theta_0$ is defined as a solution of:

$$
\hat\theta_n = \arg\max_{\theta \in \Theta} l(\theta).
$$

Note that this procedure can be inefficient: the method does not rely on the exact likelihood of $Z_{1:n}$, and approximating the true log-chi-square density by a normal density can be rather inappropriate (see Figure 1 below).

Figure 1: Approximation of the log-chi-square density (red) by a Gaussian density with mean $\mathcal{E} = -1.27$ and variance $\sigma_\varepsilon^2 = \frac{\pi^2}{2}$ (black).
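The quasi log-likelihood above can be evaluated with a scalar Kalman recursion. The following sketch assumes the linear model $Z_i = X_i + \varepsilon_i$, $X_{i+1} = \phi X_i + \eta_{i+1}$, with the stationary law of $X_1$ as initial condition; the function name and argument order are illustrative choices.

```python
import numpy as np

def quasi_loglik(z, phi, sigma2, sigma_eps2):
    """Gaussian quasi log-likelihood of z_1..z_n via the scalar Kalman filter."""
    x_pred = 0.0                          # one-step-ahead prediction of X_1
    p_pred = sigma2 / (1.0 - phi**2)      # its error variance (stationary law)
    loglik = 0.0
    for zi in z:
        v = zi - x_pred                   # prediction error nu_i
        f = p_pred + sigma_eps2           # its variance F_i
        loglik += -0.5 * (np.log(2.0 * np.pi) + np.log(f) + v * v / f)
        gain = p_pred / f                 # Kalman gain
        x_filt = x_pred + gain * v        # filtered state
        p_filt = (1.0 - gain) * p_pred    # filtered variance
        x_pred = phi * x_filt             # prediction of the next state
        p_pred = phi**2 * p_filt + sigma2
    return loglik
```

The QML estimate is then obtained by maximizing `quasi_loglik` over $\Theta$ with any numerical optimizer. For the SV model, `sigma_eps2` would be set to the variance of the log-chi-square noise, which is exactly the Gaussian approximation criticized above.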

2.1.2 Sequential Bayesian Estimators: Bootstrap, APF and KSAPF 

In the Bayesian approach, the vector of parameters $\theta = (\phi, \sigma^2)$ is supposed to be random, with a prior distribution assumed to be known. We propose to use the approach of Kitagawa et al. (see [DFG01], chapter 10, p. 189), in which the parameters are supposed to be time-varying: $\theta_{i+1} = \theta_i + G_{i+1}$, where $G_{i+1}$ is a centered Gaussian random variable with a variance matrix $Q$ supposed to be known. Now, we consider the augmented state vector $\tilde X_{i+1} = (X_{i+1}, \theta_{i+1})'$, where $X_{i+1}$ is the hidden state variable and $\theta_{i+1}$ the unknown vector of parameters. In this paragraph, we use the terminology of the particle filtering method, that is: we call particle a random variable. The sequential particle estimation of the vector $\tilde X_{i+1}$ consists in a combined estimation of $X_{i+1}$ and $\theta_{i+1}$. For initialisation, the distribution of $X_1$ (see footnote 1) conditionally on $\theta_1$ is given by the stationary density $f_{\theta_1}$.



For the comparison with our contrast estimator (Section 1.1), we use three methods: the Bootstrap filter, the Auxiliary Particle filter (APF) and the Kernel Smoothing Auxiliary Particle filter (KSAPF). We refer the reader to [DFG01], P^7J and [LW01] for a complete review of these methods.

Remark. Let us underline a particularity of the combined state and parameter estimation: for the Bootstrap and APF estimators, an important issue concerns the choice of the parameter variance $Q$, since the parameter is itself unobservable. If one could choose an optimal variance $Q$, the APF estimator could be a very good estimator, since with an arbitrary variance the results are already acceptable (see Table [s]). In practice, $Q$ is chosen by an empirical optimization. The KSAPF is an enhanced version of the APF and depends on a smoothing factor $0 < h < 1$ (see [LW01]). Therefore, the choice of $h$ is another problem in practice.

Footnote 1: To avoid confusion between the true value $\theta_0$ and the initial value $\theta_1$ in the Bayesian algorithms, we start the algorithms with $i = 1$.
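The combined state and parameter estimation can be sketched as a Bootstrap filter on the augmented state $(X_i, \phi_i, \sigma_i^2)$. The code below is a simplified illustration for the Gaussian AR(1) model with measurement noise, not the implementation used in the experiments: the uniform prior ranges are those of Section 2.2, while the diagonal form of $Q$, the clipping of the parameters to admissible values and multinomial resampling at every step are assumptions made here.

```python
import numpy as np

def bootstrap_filter(z, sigma_eps2, n_particles, q_diag, rng):
    """Bootstrap filter on the augmented state (X_i, phi_i, sigma2_i)."""
    m = n_particles
    phi = rng.uniform(0.5, 0.9, m)           # prior on phi
    sig2 = rng.uniform(0.1, 0.4, m)          # prior on sigma^2
    x = rng.normal(0.0, np.sqrt(sig2 / (1.0 - phi**2)))  # stationary law of X_1
    for i, zi in enumerate(z):
        if i > 0:
            # Artificial random-walk evolution of the parameters (variance Q),
            # clipped to keep the model well defined (assumption of this sketch).
            phi = np.clip(phi + rng.normal(0.0, np.sqrt(q_diag[0]), m), -0.99, 0.99)
            sig2 = np.clip(sig2 + rng.normal(0.0, np.sqrt(q_diag[1]), m), 1e-6, None)
            x = phi * x + rng.normal(0.0, np.sqrt(sig2))  # propagate the state
        # Weight by the Gaussian measurement density of Z_i given X_i.
        logw = -0.5 * (zi - x) ** 2 / sigma_eps2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        idx = rng.choice(m, size=m, p=w)     # multinomial resampling
        phi, sig2, x = phi[idx], sig2[idx], x[idx]
    return float(phi.mean()), float(sig2.mean())
```

The returned values are the posterior means of the parameter particles after the last observation. The sensitivity to `q_diag` discussed in the remark above can be explored by rerunning the filter with different variances.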



2.2 Application to Simulated Data 

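For the AR(1) and SV models, we sample the trajectory of the $X_i$ with the parameters $\phi_0 = 0.7$ and $\sigma_0^2 = 0.3$. Conditionally on the trajectory, we sample the variables $Z_i$ for $i = 1, \dots, n$, where $n$ represents the number of observations. We take $n = 1000$ and $\sigma_\varepsilon^2 = 0.1$ for the two models. This means that we take $\beta = \sqrt{2\sigma_\varepsilon^2}/\pi$ for the SV model given in (11), so that $\mathrm{Var}[\beta \log(\xi_{i+1}^2)] = \beta^2\pi^2/2 = \sigma_\varepsilon^2$. In this case, the Fourier transform of $\varepsilon_{i+1}$ is given by:

$$
f_\varepsilon^*(y) = \frac{1}{\sqrt{\pi}} \exp(-i\mathcal{E}y)\, 2^{i\beta y}\, \Gamma\left(\frac{1}{2} + i\beta y\right),
$$

with $\mathcal{E} = \beta\,\mathbb{E}[\log(\xi_{i+1}^2)]$ (see Appendix C.2).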
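The sampling scheme just described can be sketched as follows. The code simulates the hidden AR(1) trajectory and then, conditionally on it, the observations of both models. For the SV model it works directly with the log-transformed observations and assumes the relation $\sigma_\varepsilon^2 = \beta^2\pi^2/2$ between $\beta$ and the noise variance; function names are illustrative.

```python
import numpy as np

# E[log(xi^2)] for a standard Gaussian xi, about -1.27.
E_LOG_CHI2 = -(np.euler_gamma + np.log(2.0))

def simulate_states(n, phi, sigma2, rng):
    """Stationary Gaussian AR(1) trajectory X_1..X_n."""
    x = np.empty(n)
    x[0] = rng.normal(0.0, np.sqrt(sigma2 / (1.0 - phi**2)))
    eta = rng.normal(0.0, np.sqrt(sigma2), n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + eta[i]
    return x

def observe_ar1(x, sigma_eps2, rng):
    """Observations of the AR(1) model with Gaussian measurement noise."""
    return x + rng.normal(0.0, np.sqrt(sigma_eps2), x.size)

def observe_log_sv(x, sigma_eps2, rng):
    """Log-transformed SV observations with centered log-chi-square noise."""
    beta = np.sqrt(2.0 * sigma_eps2) / np.pi   # so that Var(eps) = sigma_eps2
    xi = rng.standard_normal(x.size)
    return x + beta * (np.log(xi**2) - E_LOG_CHI2)
```

With `x = simulate_states(1000, 0.7, 0.3, rng)` and `sigma_eps2 = 0.1`, the two observation functions reproduce the simulation design of this section, up to the choice of random number generator.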



Since particle filters are advanced Monte Carlo algorithms, we take a large number of particles $M$. For the three methods, we take $M$ equal to 5000. Note that for the Bayesian procedures (Bootstrap, APF and KSAPF), we need a prior on $\theta$, and this only at the first step. The prior for $\theta_1$ is taken to be the Uniform law and, conditionally on $\theta_1$, the distribution of $X_1$ is the stationary law:

$$
\begin{cases}
p(\theta_1) = \mathcal{U}(0.5, 0.9) \times \mathcal{U}(0.1, 0.4), \\
X_1 \mid \theta_1 \sim f_{\theta_1}.
\end{cases}
$$

We take h = 0.1 for the KSAPF and Q ^ ^■^■'^^^ ^ ^ ^ iQ-e^ ^'^^ Bootstrap filter. 

Remark. By choosing the stationary law and the Uniform law around the true parameters we bias favourably 
the Bayesian algorithms. 



2.3 Numerical Results 



In the numerical section we compare the different estimators: the QML estimator defined in Section 2.1.1, the Bayesian estimators defined in Section 2.1.2 and our contrast estimator defined in Section 1.1. For the comparison of the computing time, we also compare our contrast estimator with the SIEMLE proposed by Kim, Shephard and Chib (see Appendix D.1 and [KSC98] for more general details).



2.3.1 Computing time 

From a theoretical point of view, the MLE is asymptotically efficient. However, in practice, since the states $(X_1, \dots, X_n)$ are unobservable and the SV model is non-Gaussian, the likelihood is intractable. We have to use numerical methods to approximate it. In this section, we illustrate the SIEMLE, which consists in approximating the likelihood and applying the Expectation-Maximization algorithm introduced by Dempster et al. [DLR77] to find the parameter $\theta$.

To illustrate the SIEMLE for the SV model, we run one estimation with a number of observations $n$ equal to 1000. Although the estimate is good, the computing time is very long compared with the other methods (see Table 1 and Figure 2). This result illustrates the numerical complexity of the SIEMLE (see Appendix D.1).



Therefore, in the following, we only compare our contrast estimator with the QML and Bayesian estimators. The results are illustrated in Figure 2. We can see that our contrast estimator is the fastest for the Gaussian AR(1) model. The QML is the fastest for the SV model, since it assumes that the measurement errors are Gaussian, but we show in Figures 3, 4 and 5 that it is a biased estimator with a large mean square error. For our algorithm, the function $u^*_{l_\theta}$ has an explicit expression for the Gaussian AR(1) model, but for the SV model it is approximated numerically, since the Fourier transform of the function $u_{l_\theta}$ has no explicit form (see footnote 2). This explains why our algorithm is slower on the SV model than on the Gaussian AR(1) model. In spite of this approximation, our contrast estimator is fast and its implementation is straightforward.



$\phi_0$    $\sigma_0^2$    $\hat\phi_n$    $\hat\sigma_n^2$    CPU (sec)
0.7         0.3             0.667           0.2892              74300

Table 1: SIEMLE estimation for the SV model. The number of observations is $n = 1000$ and the number of sweeps for the Gibbs sampler is $M = 100$ (see Appendix D.1).




[Figure 2 image: CPU time against n = number of observations; curves: KSAPF, Bootstrap, APF, QML, Contrast.]

Figure 2: Top: Comparison of the computing time (CPU in seconds) with respect to the number of observations, from $n = 100$ up to 2000, for the Gaussian AR(1) model. The number of particles in the Bayesian estimations is $M = 5000$. Bottom: Comparison of the CPU time with respect to the number of observations, from $n = 100$ up to 1500, for the SV model with $M = 5000$ particles.

2.3.2 Parameter estimates 

For the Gaussian AR(1) model, we run $N = 1000$ estimates for each method (QML, APF, KSAPF and Bootstrap filter) and $N = 500$ for the SV model. The number of observations $n$ is equal to 1000 for the two models. In order to compare the performance of our estimator with the others, we compute for each method the Mean Square Error (MSE) defined by:

$$
\mathrm{MSE} = \frac{1}{N}\sum_{j=1}^{N} \left[ (\hat\phi_j - \phi_0)^2 + (\hat\sigma_j^2 - \sigma_0^2)^2 \right]. \tag{14}
$$

Footnote 2: We use a quadrature method implemented in Matlab to approximate the Fourier transform of $u_{l_\theta}(y)$. One can also use the FFT method, and we expect that the contrast estimator would be faster in this case.
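As a small illustration of the criterion (14), the following sketch computes the MSE from $N$ replicated estimates, reading (14) as the average over replications of the summed squared errors on $\phi$ and $\sigma^2$. The function name and the toy values in the usage note are purely illustrative.

```python
import numpy as np

def mse(phi_hats, sigma2_hats, phi0, sigma2_0):
    """Average over replications of the squared errors on phi and sigma^2."""
    phi_hats = np.asarray(phi_hats, dtype=float)
    sigma2_hats = np.asarray(sigma2_hats, dtype=float)
    return float(np.mean((phi_hats - phi0) ** 2 + (sigma2_hats - sigma2_0) ** 2))
```

For instance, with two replications `mse([0.8, 0.6], [0.3, 0.5], 0.7, 0.3)` averages the per-replication errors 0.01 and 0.05.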



We illustrate the different estimates by boxplots (see Figures 3 and 4). We also illustrate in Figure 5 the MSE of each estimator, computed by Equation (14). We can see that, for the parameter $\phi_0$, the QML estimator is better for the Gaussian AR(1) model than for the SV model (see Figure 3). Indeed, the Gaussianity assumption is wrong for the SV model. Moreover, the estimate of $\sigma_0^2$ by QML is very poor for the two models (see Figure 4), and its boxplots have the largest dispersion, meaning that the QML method is not very stable. The Bootstrap, APF and KSAPF also have a large dispersion in their boxplots, in particular for the parameter $\phi_0$ (see Figure 3). Besides, the Bootstrap filter is less efficient than the APF and KSAPF. For the Gaussian and SV models, the boxplots of our contrast estimator show that it is the most stable with respect to $\phi_0$, and we obtain similar results for $\sigma_0^2$. The MSE is better for the SV model and is the smallest for our contrast estimator.



Figure 3: Boxplot of $\phi$. True value: $\phi_0 = 0.7$. Left: Gaussian AR(1) model. Right: SV model.

Figure 4: Boxplot of $\sigma^2$. True value: $\sigma_0^2 = 0.3$. Left: Gaussian AR(1) model. Right: SV model.

Figure 5: MSE computed by Eq. (14). Left: Gaussian AR(1) model. Right: SV model.



2.3.3 Confidence Interval of the contrast estimator 



To illustrate the statistical properties of our contrast estimator, we compute for each model the confidence intervals for $N = 1$ estimator and the coverages for $N = 1000$ with respect to the number of observations. The coverage corresponds to the proportion of times for which the true parameter $\theta_{0,i}$, $i = 1, 2$, belongs to the confidence interval. The results are illustrated in Figures 6, 7 and 8: for the Gaussian and SV models, the coverage converges to 95% for a small number of observations. As expected, the width of the confidence interval decreases with the number of observations. Note that an MLE confidence interval would of course be smaller, since the MLE is efficient, but the corresponding computing time would be huge.
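The coverage computation can be sketched as follows, assuming that for each replication one has an estimate of a coordinate of $\theta_0$ together with an estimate of its asymptotic variance $e_i'\Sigma(\hat\theta_n)e_i$. The function names are illustrative, and the Gaussian quantile is taken from the Python standard library.

```python
from statistics import NormalDist

import numpy as np

def confidence_interval(theta_hat, avar, n, alpha=0.05):
    """Asymptotic interval theta_hat +/- z_{1-alpha/2} * sqrt(avar / n)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * np.sqrt(avar / n)
    return theta_hat - half, theta_hat + half

def coverage(theta_hats, avars, n, theta0, alpha=0.05):
    """Proportion of replications whose interval contains the true value."""
    lo, hi = confidence_interval(np.asarray(theta_hats, dtype=float),
                                 np.asarray(avars, dtype=float), n, alpha)
    return float(np.mean((lo <= theta0) & (theta0 <= hi)))
```

With a well-calibrated asymptotic variance, `coverage` should approach $1 - \alpha$ as $n$ grows, which is the behaviour reported for Figure 6.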







Figure 6: Coverage with respect to the number of observations, n = 100 up to 5000, for N = 1000 estimators.
Top: Gaussian AR(1) model. Bottom: SV model.




Figure 7: Confidence interval for the parameter $\phi_0$ with respect to the number of observations, n = 100 up to
5000, for N = 1 estimator. Left: Gaussian AR(1) model. Right: SV model.




Figure 8: Confidence interval for the parameter $\sigma_0^2$ with respect to the number of observations, n = 100 up to
5000, for N = 1 estimator. Left: Gaussian AR(1) model. Right: SV model.
2.4 Application to Real Data

The data consist of daily observations of the FTSE stock price index and the S&P500 stock price index. The series,
taken from boursorama.com, are closing prices from January 1, 1973 to November 2, 1976 for the FTSE and from
September 24, 2001 to July 7, 2005 for the S&P500, leaving a sample of 1000 observations for each series.
The daily prices $S_i$ are transformed into compounded rates of return, centered around their sample mean $c$ for
self-normalization (see [MS98] and [GHR96]): $y_i = 100 \times \log(S_{i+1}/S_i) - c$. We want to model these data by
the SV model defined in (12), leading to:

$$Z_i = \log(Y_i^2) + 1.27$$

These data are represented in Figure 9.
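
The transformation of prices into the observations $Z_i$ can be sketched as below. The function name is hypothetical, the return is taken between consecutive prices (an assumption about the indexing), and the constant 1.27 is the offset quoted in the text. A centered return exactly equal to zero would make the logarithm undefined, so the sketch raises an error in that case rather than guessing a fix.

```python
import math


def sv_log_transform(prices, offset=1.27):
    """Map prices S_i to Z_i = log(y_i^2) + offset, y_i = 100*log(S_{i+1}/S_i) - c."""
    returns = [100.0 * math.log(prices[i + 1] / prices[i])
               for i in range(len(prices) - 1)]
    c = sum(returns) / len(returns)          # sample mean of the returns
    centered = [r - c for r in returns]
    if any(y == 0.0 for y in centered):
        raise ValueError("a centered return is exactly zero; log(y^2) is undefined")
    return [math.log(y * y) + offset for y in centered]


if __name__ == "__main__":
    print(sv_log_transform([100.0, 110.0, 99.0]))
```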








Figure 9: Top Left: Graph of $Z_i$ = FTSE. Top Right: Graph of $Z_i$ = SP500. Bottom Left: Autocorrelation of
$Z_i$ = FTSE. Bottom Right: Autocorrelation of $Z_i$ = SP500.



2.4.1 Parameter Estimates



In the empirical analysis, we compare five estimators: the QML, the Bootstrap filter, the APF, the KSAPF and
our contrast estimator. The variance of the measurement noise is $\sigma_\varepsilon^2 = \pi^2/2$, that
is, $\beta$ is equal to 1 (see Section 2.0.2). Table 2 summarises the parameter estimates and the computing
time for the five methods. To initialize the Bayesian procedures, we take the uniform law for the
parameters, $p(\theta_1) = \mathcal{U}(0.4, 0.95) \times \mathcal{U}(0.1, 0.5)$, and the stationary law for the
log-volatility process $X_1$, i.e.,

$$f_{\theta_1}(X_1) = \mathcal{N}\left(0, \frac{\sigma^2}{1-\phi^2}\right)$$



The estimates of $\phi$ are in full accordance with results reported in previous studies of SV models. This
parameter is in general close to 1, which implies persistent logarithmic volatility. Furthermore, in view of our
simulation study, we are inclined to trust the contrast estimator. Therefore, we compute the corresponding
confidence intervals at level 5% (see Table 3). For the SP500, note that the Bootstrap filter and QML estimates are
not in the confidence interval for the parameter $\sigma^2$. Furthermore, for the FTSE, the Bootstrap filter estimate
is not in the confidence interval for the parameter $\phi$. These results are consistent with the simulations, where
we showed that both methods were biased (see Section 2.3.2). The autocorrelation coefficients for the FTSE are higher
than for the S&P500 (see Figure 9), which explains the large confidence interval for the FTSE. Indeed, the
covariance terms of the variance matrix given in Corollary 1.2 are large in this case. Note also that, as expected,
the computing time for the QML is the shortest because it assumes Gaussianity, which is probably not the case
here. Apart from QML, the contrast is the fastest method. The results are presented in Table 2 below.



Index                     FTSE                           SP500
Method             phi     sigma^2   CPU          phi     sigma^2   CPU
Contrast           0.96    0.12      114          0.865   0.119     108
Bootstrap filter   0.784   0.39      461.97       0.830   0.247     358
APF                0.90    0.317     472          0.917   0.179     405
KSAPF              0.916   0.255     455          0.929   0.239     502
QML                0.906   0.279     31.67        0.895   0.257     104

Table 2: Parameter estimates: n = 1000 and the number of particles M = 5000 for the Bayesian estimations.
The CPU is in seconds.



Index     Confidence interval for phi     Confidence interval for sigma^2
SP500     [0.791; 0.939]                  [0.005; 0.233]
FTSE      [0.822; 0.99]                   [0.001; 0.57]

Table 3: Confidence interval at level 5%.



2.5 Summary and Conclusions 

In this paper we propose a new method to estimate a hidden stochastic model of the form (1). This method
is based on the deconvolution strategy and leads to a consistent and asymptotically normal estimator. We
empirically study the performance of our estimator for the Gaussian AR(1) model and the SV model, and we are
able to construct a confidence interval (see Figures 7 and 8). As the boxplots in Figures 3 and 4 show, only the
Contrast, the APF, and the KSAPF estimators are comparable. Indeed, the QML and the Bootstrap filter
estimators are biased and their MSE are poor; in particular, the QML method is the worst estimator (see
Figure 5). One can see that the QML estimator proposed by Harvey et al. is not suitable for the SV model
because the approximation of the log-chi-square density by the Gaussian density is not robust (see Figure 1).
Furthermore, if we compare the MSE of the three sequential Bayesian estimators, the KSAPF estimator is the
best method. From a Bayesian point of view, it is known that the Bootstrap filter is less efficient than the APF
and KSAPF filters since, by using the transition density as the importance density, the propagation step of the
particles is made without taking the observations into account (see [DJ11]).

Among the three estimators (Contrast, APF, and KSAPF) which give good results, our estimator outperforms
the others in terms of MSE (see Figure 5). Moreover, as we already mentioned, in the combined state and
parameter estimation the difficulties are the choice of Q, h and the prior law, since the results depend on these
choices. In the numerical section, we have used the stationary law for the variable $X_1$ and this choice yields
good results, but we expect that the behavior of the Bayesian estimation will be worse for another prior. The
implementation of the contrast estimator is the easiest and it leads to confidence intervals with a larger variance
than the SIEMLE but at a smaller computing cost, in particular for the AR(1) Gaussian model (see Figure 2).
Furthermore, the contrast estimator does not require an arbitrary choice of parameter in practice.
A M-Estimator

Definition A.1 (Geometrically ergodic process).

Denote by $Q^n(x, \cdot)$ the transition kernel at step $n$ of a (discrete-time) stationary Markov chain $(X_n)_n$
started at $x$ at time 0, that is, $Q^n(x, F) = \mathbb{P}(X_n \in F \mid X_0 = x)$. Let $\pi$ denote the stationary law
of $X_n$ and let $f$ be any measurable function. We call mixing coefficients $(\beta_n)_n$ the coefficients defined,
for each $n$, by:

$$\beta_n = \int \sup_{\|f\|_\infty \le 1} \left| Q^n(x, f) - \pi(f) \right| \pi(dx),$$

where $\pi(f) = \int f(y)\pi(dy)$. We say that a process is geometrically ergodic if the sequence of
mixing coefficients decreases geometrically, that is:

$$\exists\, 0 < \eta < 1 \text{ such that } \beta_n \le \eta^n.$$

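The following results are the main tools for the proof of Theorem 1.1.

Consider the following quantities:

$$\mathbf{P}_n h_\theta = \frac{1}{n}\sum_{i=1}^{n} h_\theta(Y_i); \quad \mathbf{P}_n S_\theta = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta h_\theta(Y_i) \quad \text{and} \quad \mathbf{P}_n H_\theta = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 h_\theta(Y_i),$$

where $h_\theta(y)$ is a real function from $\Theta \times \mathcal{Y}$ with values in $\mathbb{R}$.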

Lemma A.2. Uniform Law of Large Numbers (ULLN) (see [NM94] for the proof).

Let $(Y_i)$ be an ergodic stationary process and suppose that:

1. $h_\theta(y)$ is continuous in $\theta$ for all $y$ and measurable in $y$ for all $\theta$ in the compact subset $\Theta$.

2. There exists a function $s(y)$ (called the dominating function) such that $|h_\theta(y)| \le s(y)$ for all
$\theta \in \Theta$ and $\mathbb{E}[s(Y_1)] < \infty$. Then:

$$\sup_{\theta \in \Theta} |\mathbf{P}_n h_\theta - \mathbf{P} h_\theta| \to 0 \text{ in probability as } n \to \infty.$$

Moreover, $\mathbf{P} h_\theta$ is a continuous function of $\theta$.

Proposition A.3 (Proposition 7.8 p. 472 in [Hay00]. The proof is in [Tak85], Theorem 4.1.5). Suppose that:

1. $\theta_0$ is in the interior of $\Theta$.

2. $h_\theta(y)$ is twice continuously differentiable in $\theta$ for any $y$.

3. The Hessian matrix of the application $\theta \mapsto \mathbf{P} h_\theta$ is non-singular.

4. $\sqrt{n}\,\mathbf{P}_n S_\theta \to \mathcal{N}(0, \Omega(\theta))$ in law as $n \to \infty$, with $\Omega(\theta)$ a positive definite matrix.

5. Local dominance on the Hessian: for some neighbourhood $\mathcal{U}$ of $\theta_0$:

$$\mathbb{E}\left[\sup_{\theta \in \mathcal{U}} \left\| \nabla_\theta^2 h_\theta(Y_1) \right\|\right] < \infty,$$

so that, for any consistent estimator $\hat\theta$ of $\theta_0$, we have: $\mathbf{P}_n H_{\hat\theta} \to \mathbb{E}[\nabla_\theta^2 h_\theta(Y_1)]$ in probability as $n \to \infty$.
Then, $\hat\theta$ is asymptotically normal with asymptotic covariance matrix given by:

$$\Sigma(\theta_0) = \mathbb{E}[\nabla_\theta^2 h_\theta(Y_1)]^{-1}\, \Omega(\theta_0)\, \mathbb{E}[\nabla_\theta^2 h_\theta(Y_1)]^{-1}.$$
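Proposition A.4 (The proof is in [Jon04]).

Let $Y_i$ be an ergodic stationary Markov chain and let $g: \mathcal{Y} \to \mathbb{R}$ be a Borel function. If:
1. $Y_i$ is geometrically ergodic and $\mathbb{E}[|g(Y_1)|^{2+\delta}] < \infty$ for some $\delta > 0$.
Then, when $n \to \infty$ we have:

$$\sqrt{n}\,(\mathbf{P}_n g - \mathbf{P} g) \to \mathcal{N}(0, \sigma_g^2) \text{ in law},$$

where $\sigma_g^2 := \mathrm{Var}[g(Y_1)] + 2\sum_{j=2}^{\infty} \mathrm{Cov}(g(Y_1), g(Y_j)) < \infty$.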

B Proofs of Theorem 1.1

For the reader's convenience we split the proof of Theorem 1.1 into three parts: in Subsection B.1 we give the
proof of the existence of our contrast estimator defined in (1.1). In Subsection B.2 we prove the consistency,
that is, the first part of Theorem 1.1. Then, we prove the asymptotic normality of our estimator in Subsection
B.3, that is, the second part of Theorem 1.1. Section B.4 is devoted to Corollary 1.2. Finally, in Section
C we prove that Theorem 1.1 applies to the AR(1) and SV models.



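B.1 Proof of the existence and measurability of the M-Estimator

By assumption, the function $\theta \mapsto \|l_\theta\|_2^2$ is continuous. Moreover, $l_\theta^*$ and then
$u^*_{l_\theta}(x) = \frac{1}{2\pi}\int e^{ixy}\, \frac{l_\theta^*(-y)}{f_\varepsilon^*(y)}\, dy$ are
continuous w.r.t. $\theta$. In particular, the function $m_\theta(\mathbf{z}_i) = \|l_\theta\|_2^2 - 2 z_{i+1} u^*_{l_\theta}(z_i)$ is
continuous w.r.t. $\theta$. Hence, the function $\mathbf{P}_n m_\theta = \frac{1}{n}\sum_{i=1}^{n} m_\theta(\mathbf{z}_i)$ is continuous
w.r.t. $\theta$ belonging to the compact subset $\Theta$. So, there exists $\tilde\theta$ belonging to $\Theta$ such that:

$$\inf_{\theta \in \Theta} \mathbf{P}_n m_\theta = \mathbf{P}_n m_{\tilde\theta}. \qquad \square$$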



B.2 Proof of the Consistency 

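By assumption $l_\theta$ is continuous w.r.t. $\theta$ for any $x$ and measurable w.r.t. $x$ for all $\theta$, which implies
the continuity and the measurability of the function $\mathbf{P}_n m_\theta$ on the compact subset $\Theta$. Furthermore, the
local dominance assumption (C) implies that $\mathbb{E}[\sup_{\theta \in \Theta} |m_\theta(\mathbf{Z}_i)|]$ is finite. Indeed,

$$|m_\theta(\mathbf{z}_i)| \le \|l_\theta\|_2^2 + 2\,|z_{i+1} u^*_{l_\theta}(z_i)|.$$

As $\|l_\theta\|_2^2$ is continuous on the compact subset $\Theta$, $\sup_{\theta \in \Theta} \|l_\theta\|_2^2$ is finite. Therefore,
$\mathbb{E}[\sup_{\theta \in \Theta} |m_\theta(\mathbf{Z}_i)|]$ is finite if $\mathbb{E}[\sup_{\theta \in \Theta} |Z_{i+1} u^*_{l_\theta}(Z_i)|]$ is finite.
Lemma A.2 (ULLN) gives us the uniform convergence in probability of the contrast function: for any $\varepsilon > 0$:

$$\lim_{n \to +\infty} \mathbb{P}\left(\sup_{\theta \in \Theta} |\mathbf{P}_n m_\theta - \mathbf{P} m_\theta| \le \varepsilon\right) = 1.$$

Combining the uniform convergence with Theorem 2.1 p. 2121 chapter 36 in [EM94] yields the weak
(convergence in probability) consistency of the estimator. □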






Remark. In most applications, we do not know the bounds for the true parameter, so the compactness
assumption is sometimes restrictive. One can replace the compactness assumption by: $\theta_0$ is an element of the
interior of a convex parameter space $\Theta \subset \mathbb{R}^r$. Then, under our assumptions except the compactness, the
estimator is also consistent. The proof is the same and the existence is proved by using convex optimization
arguments. One can refer to [Hay00] for this discussion.

B.3 Proof of the asymptotic normality 

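The proof is based on the following Lemma:

Lemma B.1. Suppose that the conditions of the consistency hold. Suppose further that:

1. $Z_i$ is geometrically ergodic.

2. (Moment condition): for some $\delta > 0$ and for each $j \in \{1, \dots, r\}$:

$$\mathbb{E}\left[\left| Z_{i+1}\, u^*_{\partial l_\theta / \partial\theta_j}(Z_i) \right|^{2+\delta}\right] < \infty.$$

3. (Hessian local condition): for some neighbourhood $\mathcal{U}$ of $\theta_0$ and for $j, k \in \{1, \dots, r\}$:

$$\mathbb{E}\left[\sup_{\theta \in \mathcal{U}} \left| Z_{i+1}\, u^*_{\partial^2 l_\theta / \partial\theta_j \partial\theta_k}(Z_i) \right|\right] < \infty.$$

Then, $\hat\theta_n$ defined in (1.1) is asymptotically normal with asymptotic covariance matrix given by:

$$\Sigma(\theta_0) = V_{\theta_0}^{-1}\, \Omega(\theta_0)\, V_{\theta_0}^{-1},$$

where $V_{\theta_0}$ is the Hessian of the application $\theta \mapsto \mathbf{P} m_\theta$ (its expression is given at the end of Section B.4).

Proof. The proof follows from Proposition A.3 and Proposition A.4. □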

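It just remains to check that the conditions (2) and (3) of Lemma B.1 hold under our assumptions (T).

Moment condition: As the function $l_\theta$ is twice continuously differentiable w.r.t. $\theta$, for all $\mathbf{z}_i \in \mathbb{R}^2$, the
application $m_\theta(\mathbf{z}_i): \theta \in \Theta \mapsto m_\theta(\mathbf{z}_i) = \|l_\theta\|_2^2 - 2 z_{i+1} u^*_{l_\theta}(z_i)$ is twice continuously
differentiable for all $\theta \in \Theta$ and its first derivatives are given by:

$$\nabla_\theta m_\theta(\mathbf{z}_i) = \nabla_\theta \|l_\theta\|_2^2 - 2 z_{i+1} \nabla_\theta u^*_{l_\theta}(z_i).$$

By assumption, for each $j \in \{1, \dots, r\}$, $\frac{\partial l_\theta}{\partial \theta_j} \in L_1(\mathbb{R})$, therefore one can apply
the Lebesgue Derivation Theorem and Fubini's Theorem to obtain

$$\nabla_\theta m_\theta(\mathbf{z}_i) = \nabla_\theta \|l_\theta\|_2^2 - 2 z_{i+1} u^*_{\nabla_\theta l_\theta}(z_i). \qquad (15)$$

Then, for some $\delta > 0$:

$$\mathbb{E}\left[|\nabla_\theta m_\theta(\mathbf{Z}_i)|^{2+\delta}\right] \le C_1 \left|\nabla_\theta \|l_\theta\|_2^2\right|^{2+\delta} + C_2\, \mathbb{E}\left[\left|Z_{i+1} u^*_{\nabla_\theta l_\theta}(Z_i)\right|^{2+\delta}\right], \qquad (16)$$

where $C_1$ and $C_2$ are two positive constants. By assumption, the function $\|l_\theta\|_2^2$ is twice continuously
differentiable w.r.t. $\theta$. Hence, $\nabla_\theta \|l_\theta\|_2^2$ is continuous on the compact subset $\Theta$ and the first term of
equation (16) is finite. The second term is finite by the moment assumption (T).

Hessian local dominance: For $j, k \in \{1, \dots, r\}$, $\frac{\partial^2 l_\theta}{\partial\theta_j \partial\theta_k} \in L_1(\mathbb{R})$, so the Lebesgue Derivation Theorem gives:

$$\nabla_\theta^2 m_\theta(\mathbf{z}_i) = \nabla_\theta^2 \|l_\theta\|_2^2 - 2 z_{i+1} u^*_{\nabla_\theta^2 l_\theta}(z_i).$$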
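And, for some neighbourhood $\mathcal{U}$ of $\theta_0$:

$$\mathbb{E}\left[\sup_{\theta \in \mathcal{U}} \left\|\nabla_\theta^2 m_\theta(\mathbf{Z}_i)\right\|\right] \le \sup_{\theta \in \mathcal{U}} \left\|\nabla_\theta^2 \|l_\theta\|_2^2\right\| + 2\,\mathbb{E}\left[\sup_{\theta \in \mathcal{U}} \left\|Z_{i+1} u^*_{\nabla_\theta^2 l_\theta}(Z_i)\right\|\right].$$

The first term of the above equation is finite by continuity and by a compactness argument. And the second
term is finite by the Hessian local dominance assumption (T). □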



B.4 Proof of Corollary 1.2



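By replacing $\nabla_\theta m_\theta(\mathbf{Z}_1)$ by its expression (15), we have

$$\Omega_1(\theta) = \mathrm{Var}\left[\nabla_\theta \|l_\theta\|_2^2 - 2 Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right] = 4\, \mathrm{Var}\left[Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right]$$
$$= 4\left(\mathbb{E}\left[Z_2^2 \left(u^*_{\nabla_\theta l_\theta}(Z_1)\right)\left(u^*_{\nabla_\theta l_\theta}(Z_1)\right)'\right] - \mathbb{E}\left[Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right] \mathbb{E}\left[Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right]'\right).$$

Furthermore, by the model equation (1) and by independence of the centered noises $(\varepsilon_2)$ and $(\eta_2)$, we have:

$$\mathbb{E}\left[Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right] = \mathbb{E}\left[b_{\phi_0}(X_1)\, u^*_{\nabla_\theta l_\theta}(Z_1)\right].$$

Using Fubini's Theorem and the model equation (1) we obtain:

$$\mathbb{E}\left[b_{\phi_0}(X_1) u^*_{\nabla_\theta l_\theta}(Z_1)\right] = \mathbb{E}\left[b_{\phi_0}(X_1) \frac{1}{2\pi} \int e^{i Z_1 y} \frac{(\nabla_\theta l_\theta)^*(-y)}{f_\varepsilon^*(y)}\, dy\right]$$
$$= \frac{1}{2\pi} \int \mathbb{E}\left[b_{\phi_0}(X_1) e^{i(X_1 + \varepsilon_1) y}\right] \frac{(\nabla_\theta l_\theta)^*(-y)}{f_\varepsilon^*(y)}\, dy$$
$$= \frac{1}{2\pi} \int \mathbb{E}\left[b_{\phi_0}(X_1) e^{i X_1 y}\right] (\nabla_\theta l_\theta)^*(-y)\, dy$$
$$= \frac{1}{2\pi}\, \mathbb{E}\left[b_{\phi_0}(X_1) \int e^{i X_1 y} (\nabla_\theta l_\theta)^*(-y)\, dy\right]$$
$$= \mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right]. \qquad (17)$$

Hence,

$$\Omega_1(\theta) = 4\,(P_2 - P_1),$$

where

$$P_1 = \mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right] \mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right]', \qquad P_2 = \mathbb{E}\left[Z_2^2 \left(u^*_{\nabla_\theta l_\theta}(Z_1)\right)\left(u^*_{\nabla_\theta l_\theta}(Z_1)\right)'\right].$$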

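Calculus of the covariance matrix of Corollary 1.2:

By replacing $\nabla_\theta m_\theta(\mathbf{Z}_1)$ by its expression (15) we have:

$$\Omega_j(\theta) = \mathrm{Cov}\left(\nabla_\theta \|l_\theta\|_2^2 - 2 Z_2 u^*_{\nabla_\theta l_\theta}(Z_1),\ \nabla_\theta \|l_\theta\|_2^2 - 2 Z_{j+1} u^*_{\nabla_\theta l_\theta}(Z_j)\right)$$
$$= 4\left[\mathbb{E}\left(Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\, Z_{j+1} u^*_{\nabla_\theta l_\theta}(Z_j)'\right) - \mathbb{E}\left(Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right) \mathbb{E}\left(Z_{j+1} u^*_{\nabla_\theta l_\theta}(Z_j)\right)'\right].$$

By using Eq. (17) and the stationarity of the $Z_i$, one can replace the second term of the above
equation by:

$$\mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right] \mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right]'.$$

Furthermore, by using the model equation (1) we obtain:

$$\mathbb{E}\left[Z_2 Z_{j+1} u^*_{\nabla_\theta l_\theta}(Z_1) u^*_{\nabla_\theta l_\theta}(Z_j)'\right] = \mathbb{E}\left[b_{\phi_0}(X_1) b_{\phi_0}(X_j) u^*_{\nabla_\theta l_\theta}(Z_1) u^*_{\nabla_\theta l_\theta}(Z_j)'\right]$$
$$+\ \mathbb{E}\left[b_{\phi_0}(X_1)(\eta_{j+1} + \varepsilon_{j+1}) u^*_{\nabla_\theta l_\theta}(Z_1) u^*_{\nabla_\theta l_\theta}(Z_j)'\right] \qquad (18)$$
$$+\ \mathbb{E}\left[b_{\phi_0}(X_j)(\eta_2 + \varepsilon_2) u^*_{\nabla_\theta l_\theta}(Z_1) u^*_{\nabla_\theta l_\theta}(Z_j)'\right] \qquad (19)$$
$$+\ \mathbb{E}\left[(\eta_2 + \varepsilon_2)(\eta_{j+1} + \varepsilon_{j+1}) u^*_{\nabla_\theta l_\theta}(Z_1) u^*_{\nabla_\theta l_\theta}(Z_j)'\right]. \qquad (20)$$

By independence of the centered noise, the terms (18), (19) and (20) are equal to zero. Now, if we use Fubini's
Theorem we have:

$$\mathbb{E}\left[b_{\phi_0}(X_1) b_{\phi_0}(X_j) u^*_{\nabla_\theta l_\theta}(Z_1) u^*_{\nabla_\theta l_\theta}(Z_j)'\right] = \mathbb{E}\left[b_{\phi_0}(X_1) b_{\phi_0}(X_j) \left(\nabla_\theta l_\theta(X_1)\right)\left(\nabla_\theta l_\theta(X_j)\right)'\right]. \qquad (21)$$

Hence, the covariance matrix is given by:

$$\Omega_j(\theta) = 4\left(\mathbb{E}\left[b_{\phi_0}(X_1) b_{\phi_0}(X_j) \left(\nabla_\theta l_\theta(X_1)\right)\left(\nabla_\theta l_\theta(X_j)\right)'\right] - \mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right] \mathbb{E}\left[b_{\phi_0}(X_1) \nabla_\theta l_\theta(X_1)\right]'\right) = 4\,(C_j - P_1).$$

Finally, we obtain: $\Omega(\theta) = \Omega_1(\theta) + 2\sum_{j > 1}^{\infty} \Omega_j(\theta)$ with $\Omega_1(\theta) = 4\,(P_2 - P_1)$ and $\Omega_j(\theta) = 4\,(C_j - P_1)$.

Expression of the Hessian matrix $V_\theta$:

We have:

$$\mathbf{P} m_\theta = \|l_\theta\|_2^2 - 2\langle l_\theta, l_{\theta_0}\rangle. \qquad (22)$$

For all $\theta$ in $\Theta$, the application $\theta \mapsto \mathbf{P} m_\theta$ is twice differentiable w.r.t. $\theta$ on the compact subset $\Theta$. And for
$j \in \{1, \dots, r\}$:

$$\frac{\partial \mathbf{P} m_\theta}{\partial \theta_j}(\theta) = 2\left\langle \frac{\partial l_\theta}{\partial\theta_j}, l_\theta \right\rangle - 2\left\langle \frac{\partial l_\theta}{\partial\theta_j}, l_{\theta_0} \right\rangle = 0 \text{ at the point } \theta_0.$$

And for $j, k \in \{1, \dots, r\}$:

$$\frac{\partial^2 \mathbf{P} m_\theta}{\partial\theta_j \partial\theta_k}(\theta) = 2\left(\left\langle \frac{\partial^2 l_\theta}{\partial\theta_j \partial\theta_k}, l_\theta \right\rangle + \left\langle \frac{\partial l_\theta}{\partial\theta_j}, \frac{\partial l_\theta}{\partial\theta_k} \right\rangle\right) - 2\left\langle \frac{\partial^2 l_\theta}{\partial\theta_j \partial\theta_k}, l_{\theta_0} \right\rangle = 2\left\langle \frac{\partial l_\theta}{\partial\theta_j}, \frac{\partial l_\theta}{\partial\theta_k} \right\rangle \text{ at the point } \theta_0.$$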



C Proof of the Applications 

C.l The Gaussian AR(1) model with measurement noise 
C.1.1 Contrast Function 

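We have:

$$l_\theta(x) = b_\phi(x) f_\theta(x) = \frac{1}{\sqrt{2\pi\gamma^2}}\, \phi x \exp\left(-\frac{x^2}{2\gamma^2}\right), \quad \text{with } \gamma^2 = \frac{\sigma^2}{1-\phi^2}.$$

So that:

$$\|l_\theta\|_2^2 = \int |l_\theta(x)|^2 dx = \frac{\phi^2 \gamma}{4\sqrt{\pi}},$$

and the Fourier Transform of $l_\theta$ is given by:

$$l_\theta^*(y) = \int e^{iyx} l_\theta(x)\, dx = \int e^{iyx} \frac{1}{\sqrt{2\pi\gamma^2}}\, \phi x \exp\left(-\frac{x^2}{2\gamma^2}\right) dx$$
$$= -i\phi \frac{d}{dy} \mathbb{E}\left[e^{iyG}\right] \quad \text{where } G \sim \mathcal{N}(0, \gamma^2)$$
$$= -i\phi \frac{d}{dy} \exp\left(-\frac{y^2 \gamma^2}{2}\right) = i\phi\gamma^2 y \exp\left(-\frac{y^2}{2}\gamma^2\right).$$

As $\varepsilon_i$ is a centered Gaussian noise with variance $\sigma_\varepsilon^2$, we have:

$$f_\varepsilon(x) = \frac{1}{\sqrt{2\pi\sigma_\varepsilon^2}} \exp\left(-\frac{x^2}{2\sigma_\varepsilon^2}\right) \quad \text{and} \quad f_\varepsilon^*(x) = \exp\left(-\frac{x^2 \sigma_\varepsilon^2}{2}\right).$$

Define:

$$u_{l_\theta}(y) = \frac{1}{2\pi} \frac{l_\theta^*(-y)}{f_\varepsilon^*(y)}.$$

Then:

$$u_{l_\theta}(y) = -\frac{i}{2\pi}\, \phi\gamma^2 y \exp\left(-\frac{y^2}{2}(\gamma^2 - \sigma_\varepsilon^2)\right),$$

i.e. the Fourier Transform of $u_{l_\theta}$ is given by:

$$u^*_{l_\theta}(y) = \int e^{iyx} u_{l_\theta}(x)\, dx = -\frac{i}{2\pi}\, \phi\gamma^2 \int x e^{iyx} \exp\left(-\frac{x^2}{2}(\gamma^2 - \sigma_\varepsilon^2)\right) dx$$
$$= -\frac{1}{\sqrt{2\pi}} \frac{\phi\gamma^2}{(\gamma^2 - \sigma_\varepsilon^2)^{1/2}} \frac{d}{dy} \mathbb{E}\left[e^{iyG}\right] \quad \text{where } G \sim \mathcal{N}\left(0, \frac{1}{\gamma^2 - \sigma_\varepsilon^2}\right)$$
$$= \frac{1}{\sqrt{2\pi}} \frac{\phi\gamma^2}{(\gamma^2 - \sigma_\varepsilon^2)^{3/2}}\, y \exp\left(-\frac{y^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right).$$

We deduce that the function $m_\theta(\mathbf{z}_i)$ is given by:

$$m_\theta(\mathbf{z}_i) = \|l_\theta\|_2^2 - 2 z_{i+1} u^*_{l_\theta}(z_i) = \frac{\phi^2\gamma}{4\sqrt{\pi}} - 2 z_i z_{i+1} \frac{1}{\sqrt{2\pi}} \frac{\phi\gamma^2}{(\gamma^2 - \sigma_\varepsilon^2)^{3/2}} \exp\left(-\frac{z_i^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right).$$

Then, the contrast estimator defined in (1.1) is given by:

$$\hat\theta_n = \arg\min_{\theta \in \Theta} \mathbf{P}_n m_\theta = \arg\min_{\theta \in \Theta} \left\{ \frac{\phi^2\gamma}{4\sqrt{\pi}} - \sqrt{\frac{2}{\pi}}\, \frac{\phi\gamma^2}{n(\gamma^2 - \sigma_\varepsilon^2)^{3/2}} \sum_{i=1}^{n} Z_{i+1} Z_i \exp\left(-\frac{Z_i^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right) \right\}.$$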
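
As an illustration, the empirical contrast for the Gaussian AR(1) model can be coded directly from its closed form: the norm term $\phi^2\gamma/(4\sqrt{\pi})$ minus $\sqrt{2/\pi}\,\phi\gamma^2 (\gamma^2 - \sigma_\varepsilon^2)^{-3/2}$ times the average of $Z_{i+1} Z_i \exp(-Z_i^2 / (2(\gamma^2 - \sigma_\varepsilon^2)))$, with $\gamma^2 = \sigma^2/(1-\phi^2)$. The sketch below is not the author's implementation: the function names are hypothetical, the average is taken over the available consecutive pairs, and a plain grid search stands in for whatever optimizer was actually used.

```python
import math


def contrast(phi, sigma2, z, sigma_eps2):
    """Empirical contrast P_n m_theta for the Gaussian AR(1) model with noise."""
    gamma2 = sigma2 / (1.0 - phi * phi)      # stationary variance of X
    a = gamma2 - sigma_eps2                  # must be positive on the parameter space
    if a <= 0.0:
        raise ValueError("gamma^2 - sigma_eps^2 must be positive")
    norm_term = phi * phi * math.sqrt(gamma2) / (4.0 * math.sqrt(math.pi))
    coef = math.sqrt(2.0 / math.pi) * phi * gamma2 / a ** 1.5
    pairs = len(z) - 1                       # assumption: average over available pairs
    total = sum(z[i + 1] * z[i] * math.exp(-z[i] ** 2 / (2.0 * a))
                for i in range(pairs))
    return norm_term - coef * total / pairs


def grid_search(z, sigma_eps2, phis, sigma2s):
    """Return the (phi, sigma2) grid point minimizing the contrast."""
    best, best_value = None, float("inf")
    for phi in phis:
        for sigma2 in sigma2s:
            try:
                value = contrast(phi, sigma2, z, sigma_eps2)
            except ValueError:
                continue                     # point outside the admissible set
            if value < best_value:
                best, best_value = (phi, sigma2), value
    return best
```

In practice `z` would be the observed series and the grids would cover the compact parameter space; a gradient-based minimizer could replace the grid search.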
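C.1.2 Checking assumptions of Theorem 1.1

Mixing properties. If $|\phi| < 1$, the process is geometrically ergodic. For further details, we refer to [Dou94].

Regularity conditions: It remains to prove that the assumptions of Theorem 1.1 hold. It is easy to see
that the only difficulty is to check the moment condition and the local dominance (C)-(T) and the uniqueness
assumption (CT). The other assumptions are easy to verify since the function $l_\theta(x)$ is regular in $\theta$ belonging
to $\Theta$.

(CT): The limit contrast function $\mathbf{P} m_\theta: \theta \in \Theta \mapsto \mathbf{P} m_\theta$, given by:

$$\mathbf{P} m_\theta = \|l_\theta\|_2^2 - 2\langle l_\theta, l_{\theta_0}\rangle = \frac{\phi^2\gamma}{4\sqrt{\pi}} - \sqrt{\frac{2}{\pi}}\, \frac{\phi\phi_0 \gamma^2 \gamma_0^2}{(\gamma^2 + \gamma_0^2)^{3/2}},$$

is differentiable for all $\theta$ in $\Theta$ and $\nabla_\theta \mathbf{P} m_\theta = 0_{\mathbb{R}^2}$ if and only if $\theta$ is equal to $\theta_0$. More precisely,
its first derivatives are given by:

$$\frac{\partial \mathbf{P} m_\theta}{\partial \phi} = \frac{1}{4\sqrt{\pi}} \frac{\phi\gamma(2 - \phi^2)}{(1 - \phi^2)} - \sqrt{\frac{2}{\pi}}\, \phi_0 \gamma_0^2 \gamma^2 (\gamma^2 + \gamma_0^2)^{-3/2} \left(\frac{1 + \phi^2}{1 - \phi^2} - \frac{3\phi^2\gamma^2}{(1 - \phi^2)(\gamma^2 + \gamma_0^2)}\right),$$

$$\frac{\partial \mathbf{P} m_\theta}{\partial \sigma^2} = \frac{\phi^2}{8\sqrt{\pi}\,\gamma(1 - \phi^2)} - \sqrt{\frac{2}{\pi}}\, \frac{\phi\phi_0\gamma_0^2}{(1 - \phi^2)(\gamma^2 + \gamma_0^2)^{3/2}} \left(1 - \frac{3\gamma^2}{2(\gamma^2 + \gamma_0^2)}\right),$$

and

$$\nabla_\theta \mathbf{P} m_\theta = 0_{\mathbb{R}^2} \iff \theta = \theta_0.$$

The vector of parameters $\theta_0$ is the only extremum of $\mathbf{P} m_\theta$ in the interior of the compact subset $\Theta$ defined
in Section 2.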
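The partial derivatives of $l_\theta$ w.r.t. $\theta$ are given by:

$$\frac{\partial l_\theta}{\partial \phi}(x) = \left(\left[-\frac{\phi^2}{1 - \phi^2} + 1\right] x + \frac{\phi^2}{(1 - \phi^2)\gamma^2}\, x^3\right) \frac{1}{\sqrt{2\pi\gamma^2}}\, e^{-\frac{x^2}{2\gamma^2}},$$

$$\frac{\partial l_\theta}{\partial \sigma^2}(x) = \left(-\frac{\phi}{2(1 - \phi^2)\gamma^2}\, x + \frac{\phi}{2(1 - \phi^2)\gamma^4}\, x^3\right) \frac{1}{\sqrt{2\pi\gamma^2}}\, e^{-\frac{x^2}{2\gamma^2}}.$$

For the reader's convenience let us introduce the following notations:

$$a_1 = \frac{1 - 2\phi^2}{1 - \phi^2} \quad \text{and} \quad a_2 = \frac{\phi^2}{(1 - \phi^2)\gamma^2}, \qquad (23)$$
$$b_1 = -\frac{\phi}{2(1 - \phi^2)\gamma^2} \quad \text{and} \quad b_2 = \frac{\phi}{2(1 - \phi^2)\gamma^4}. \qquad (24)$$

We rewrite:

$$\left(\frac{\partial l_\theta}{\partial \phi}(x), \frac{\partial l_\theta}{\partial \sigma^2}(x)\right) = \left((a_1 x + a_2 x^3) \times g_{0,\gamma^2}(x),\ (b_1 x + b_2 x^3) \times g_{0,\gamma^2}(x)\right),$$

where the function $g_{0,\gamma^2}$ denotes the normal probability density of a centered random variable with variance $\gamma^2$.

Now, we can use Corollary 1.2 to compute the Hessian matrix $V_{\theta_0}$:

$$V_{\theta_0} = 2 \begin{pmatrix} \left\|\frac{\partial l_\theta}{\partial \phi}\right\|_2^2 & \left\langle \frac{\partial l_\theta}{\partial \phi}, \frac{\partial l_\theta}{\partial \sigma^2} \right\rangle \\ \left\langle \frac{\partial l_\theta}{\partial \phi}, \frac{\partial l_\theta}{\partial \sigma^2} \right\rangle & \left\|\frac{\partial l_\theta}{\partial \sigma^2}\right\|_2^2 \end{pmatrix} \qquad (25)$$

$$= \frac{1}{\gamma_0 \sqrt{\pi}} \begin{pmatrix} a_1^2 \mathbb{E}[X^2] + 2 a_1 a_2 \mathbb{E}[X^4] + a_2^2 \mathbb{E}[X^6] & a_1 b_1 \mathbb{E}[X^2] + a_1 b_2 \mathbb{E}[X^4] + a_2 b_1 \mathbb{E}[X^4] + a_2 b_2 \mathbb{E}[X^6] \\ a_1 b_1 \mathbb{E}[X^2] + a_1 b_2 \mathbb{E}[X^4] + a_2 b_1 \mathbb{E}[X^4] + a_2 b_2 \mathbb{E}[X^6] & b_1^2 \mathbb{E}[X^2] + 2 b_1 b_2 \mathbb{E}[X^4] + b_2^2 \mathbb{E}[X^6] \end{pmatrix}$$

with $X \sim \mathcal{N}\left(0, \frac{\gamma_0^2}{2}\right)$. By replacing the terms $a_1$, $a_2$, $b_1$ and $b_2$ at the point $\theta_0$ we obtain:

$$V_{\theta_0} = \frac{1}{8\sqrt{\pi}(1 - \phi_0^2)^2} \begin{pmatrix} \gamma_0 (7\phi_0^4 - 4\phi_0^2 + 4) & \frac{-5\phi_0^5 + 3\phi_0^3 + 2\phi_0}{2\gamma_0(1 - \phi_0^2)} \\ \frac{-5\phi_0^5 + 3\phi_0^3 + 2\phi_0}{2\gamma_0(1 - \phi_0^2)} & \frac{7\phi_0^2}{4\gamma_0^3} \end{pmatrix} \qquad (26)$$

which has a positive determinant equal to 0.0956 at the true value $\theta_0 = (0.7, 0.3)$. Hence, $\theta_0$ is the only local
minimum of the function $\mathbf{P} m_\theta$ in the interior of the compact space $\Theta$.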
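
The value of the determinant quoted above can be checked numerically. The sketch below assumes the following reading of the Hessian (26): a prefactor $1/(8\sqrt{\pi}(1-\phi_0^2)^2)$ multiplying a symmetric matrix with diagonal entries $\gamma_0(7\phi_0^4 - 4\phi_0^2 + 4)$ and $7\phi_0^2/(4\gamma_0^3)$ and off-diagonal entry $(-5\phi_0^5 + 3\phi_0^3 + 2\phi_0)/(2\gamma_0(1-\phi_0^2))$, with $\gamma_0^2 = \sigma_0^2/(1-\phi_0^2)$. The function names are hypothetical.

```python
import math


def hessian_at_true_value(phi0, sigma02):
    """Hessian of the limit contrast at theta_0, as read from Eq. (26)."""
    gamma0 = math.sqrt(sigma02 / (1.0 - phi0 ** 2))
    prefactor = 1.0 / (8.0 * math.sqrt(math.pi) * (1.0 - phi0 ** 2) ** 2)
    v11 = gamma0 * (7.0 * phi0 ** 4 - 4.0 * phi0 ** 2 + 4.0)
    v12 = (-5.0 * phi0 ** 5 + 3.0 * phi0 ** 3 + 2.0 * phi0) / (
        2.0 * gamma0 * (1.0 - phi0 ** 2))
    v22 = 7.0 * phi0 ** 2 / (4.0 * gamma0 ** 3)
    return [[prefactor * v11, prefactor * v12],
            [prefactor * v12, prefactor * v22]]


def det2(matrix):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]


if __name__ == "__main__":
    print(det2(hessian_at_true_value(0.7, 0.3)))
```

A strictly positive determinant together with a positive first diagonal entry means the Hessian is positive definite at $\theta_0$, which is what the local-minimum claim relies on.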



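(C): (Local dominance): We have:

$$\mathbb{E}\left[\sup_{\theta \in \Theta} \left|Z_2 u^*_{l_\theta}(Z_1)\right|\right] = \frac{1}{\sqrt{2\pi}}\, \mathbb{E}\left[\sup_{\theta \in \Theta} \left|\frac{\phi\gamma^2}{(\gamma^2 - \sigma_\varepsilon^2)^{3/2}}\, Z_2 Z_1 \exp\left(-\frac{Z_1^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right)\right|\right].$$

The multivariate normal density of the pair $\mathbf{Z}_1 = (Z_1, Z_2)$, denoted $g_{(0,\mathcal{J})}$, is given by:

$$g_{(0,\mathcal{J})}(\mathbf{z}_1) = \frac{1}{2\pi} \det(\mathcal{J})^{-1/2} \exp\left(-\frac{1}{2} \mathbf{z}_1' \mathcal{J}^{-1} \mathbf{z}_1\right)$$

with:

$$\mathcal{J} = \begin{pmatrix} \sigma_\varepsilon^2 + \gamma^2 & \phi\gamma^2 \\ \phi\gamma^2 & \sigma_\varepsilon^2 + \gamma^2 \end{pmatrix} \quad \text{and} \quad \mathcal{J}^{-1} = \frac{1}{(\sigma_\varepsilon^2 + \gamma^2)^2 - \phi^2\gamma^4} \begin{pmatrix} \sigma_\varepsilon^2 + \gamma^2 & -\phi\gamma^2 \\ -\phi\gamma^2 & \sigma_\varepsilon^2 + \gamma^2 \end{pmatrix}.$$

By definition of the parameter space $\Theta$ and as all moments of the pair $\mathbf{Z}_1$ exist, the quantity
$\mathbb{E}[\sup_{\theta \in \Theta} |Z_2 u^*_{l_\theta}(Z_1)|]$ is finite.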

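Moment condition (T): We recall that:

$$\left(\frac{\partial l_\theta}{\partial \phi}(x), \frac{\partial l_\theta}{\partial \sigma^2}(x)\right) = \left((a_1 x + a_2 x^3) \times g_{0,\gamma^2}(x),\ (b_1 x + b_2 x^3) \times g_{0,\gamma^2}(x)\right).$$

The Fourier transforms of the first derivatives are:

$$\left(\frac{\partial l_\theta}{\partial \phi}\right)^*(x) = \int \exp(ixy)\,(a_1 y + a_2 y^3) \times g_{0,\gamma^2}(y)\, dy$$
$$= -i a_1 \mathbb{E}\left[iG \exp(ixG)\right] + i a_2 \mathbb{E}\left[-iG^3 \exp(ixG)\right] \quad \text{where } G \sim \mathcal{N}(0, \gamma^2)$$
$$= -i a_1 \frac{\partial}{\partial x} \mathbb{E}\left[\exp(ixG)\right] + i a_2 \frac{\partial^3}{\partial x^3} \mathbb{E}\left[\exp(ixG)\right]$$
$$= -i a_1 \frac{\partial}{\partial x} \exp\left(-\frac{x^2}{2}\gamma^2\right) + i a_2 \frac{\partial^3}{\partial x^3} \exp\left(-\frac{x^2}{2}\gamma^2\right)$$
$$= \left(i a_1 \gamma^2 x + 3 i a_2 \gamma^4 x - i a_2 \gamma^6 x^3\right) \exp\left(-\frac{x^2}{2}\gamma^2\right).$$

And:

$$\left(\frac{\partial l_\theta}{\partial \sigma^2}\right)^*(x) = \left(i b_1 \gamma^2 x + 3 i b_2 \gamma^4 x - i b_2 \gamma^6 x^3\right) \exp\left(-\frac{x^2}{2}\gamma^2\right).$$

We can compute the function $u_{\partial l_\theta / \partial \phi}(x)$:

$$u_{\partial l_\theta / \partial \phi}(x) = \frac{1}{2\pi} \frac{\left(\frac{\partial l_\theta}{\partial \phi}\right)^*(-x)}{f_\varepsilon^*(x)} = \frac{1}{2\pi} \left((-i a_1 \gamma^2 - 3 i a_2 \gamma^4) x + i a_2 \gamma^6 x^3\right) \exp\left(-\frac{x^2}{2}(\gamma^2 - \sigma_\varepsilon^2)\right)$$
$$= -iC\,(A_1 x - A_2 x^3)\, g_{0,\frac{1}{\gamma^2 - \sigma_\varepsilon^2}}(x),$$

with $C = \frac{1}{\sqrt{2\pi}\,(\gamma^2 - \sigma_\varepsilon^2)^{1/2}}$ and $A_1 = a_1 \gamma^2 + 3 a_2 \gamma^4 = \gamma^2 \frac{1 + \phi^2}{1 - \phi^2}$ and $A_2 = a_2 \gamma^6 = \gamma^4 \frac{\phi^2}{1 - \phi^2}$.

The Fourier transform of the function $u_{\partial l_\theta / \partial \phi}(x)$ is given by:

$$u^*_{\partial l_\theta / \partial \phi}(x) = -iC \int \exp(iyx)\,(A_1 y - A_2 y^3)\, g_{0,\frac{1}{\gamma^2 - \sigma_\varepsilon^2}}(y)\, dy$$
$$= -C A_1 \frac{\partial}{\partial x} \mathbb{E}\left[\exp(ixG)\right] - C A_2 \frac{\partial^3}{\partial x^3} \mathbb{E}\left[\exp(ixG)\right] \quad \text{where } G \sim \mathcal{N}\left(0, \frac{1}{\gamma^2 - \sigma_\varepsilon^2}\right)$$
$$= \left(\Psi_1^{\phi} x + \Psi_2^{\phi} x^3\right) \exp\left(-\frac{x^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right), \qquad (27)$$

with $\Psi_1^{\phi} = C\left(\frac{A_1}{(\gamma^2 - \sigma_\varepsilon^2)} - \frac{3 A_2}{(\gamma^2 - \sigma_\varepsilon^2)^2}\right)$ and $\Psi_2^{\phi} = C\left(\frac{A_2}{(\gamma^2 - \sigma_\varepsilon^2)^3}\right)$. By the same arguments, we obtain:

$$u^*_{\partial l_\theta / \partial \sigma^2}(x) = \left(\Psi_1^{\sigma^2} x + \Psi_2^{\sigma^2} x^3\right) \exp\left(-\frac{x^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right), \qquad (28)$$

with $\Psi_1^{\sigma^2} = C\left(\frac{B_1}{(\gamma^2 - \sigma_\varepsilon^2)} - \frac{3 B_2}{(\gamma^2 - \sigma_\varepsilon^2)^2}\right)$, $\Psi_2^{\sigma^2} = C\left(\frac{B_2}{(\gamma^2 - \sigma_\varepsilon^2)^3}\right)$, $B_1 = b_1 \gamma^2 + 3 b_2 \gamma^4 = \frac{\phi}{1 - \phi^2}$
and $B_2 = b_2 \gamma^6 = \gamma^2 \frac{\phi}{2(1 - \phi^2)}$.

Hence, for some $\delta > 0$, $\mathbb{E}\left[\left|Z_2 u^*_{\nabla_\theta l_\theta}(Z_1)\right|^{2+\delta}\right]$ is finite if:

$$\mathbb{E}\left[\left|\left(\Psi_1^{\phi} Z_1 Z_2 + \Psi_2^{\phi} Z_1^3 Z_2\right) \exp\left(-\frac{Z_1^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right)\right|^{2+\delta}\right] < \infty,$$

$$\mathbb{E}\left[\left|\left(\Psi_1^{\sigma^2} Z_1 Z_2 + \Psi_2^{\sigma^2} Z_1^3 Z_2\right) \exp\left(-\frac{Z_1^2}{2(\gamma^2 - \sigma_\varepsilon^2)}\right)\right|^{2+\delta}\right] < \infty,$$

which is satisfied by the existence of all moments of the pair $\mathbf{Z}_1$. One can check that the Hessian local assumption
(T) is also satisfied by the same arguments.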



C.1.3 Explicit form of the Covariance matrix 

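Lemma C.1. The matrix $\Sigma(\theta_0)$ in the Gaussian AR(1) model is given by:

$$\Sigma(\theta_0) = V_{\theta_0}^{-1}\, \Omega(\theta_0)\, V_{\theta_0}^{-1}$$

with

$$V_{\theta_0} = \frac{1}{8\sqrt{\pi}(1 - \phi_0^2)^2} \begin{pmatrix} \gamma_0 (7\phi_0^4 - 4\phi_0^2 + 4) & \frac{-5\phi_0^5 + 3\phi_0^3 + 2\phi_0}{2\gamma_0(1 - \phi_0^2)} \\ \frac{-5\phi_0^5 + 3\phi_0^3 + 2\phi_0}{2\gamma_0(1 - \phi_0^2)} & \frac{7\phi_0^2}{4\gamma_0^3} \end{pmatrix}$$

And

$$\Omega(\theta_0) = \Omega_1(\theta_0) + 2\sum_{j > 1}^{\infty} \Omega_j(\theta_0) = 4\,[P_2 - P_1] + 8\sum_{j > 1}^{\infty} (C_j - P_1)$$

where:

$$P_1 = \begin{pmatrix} \frac{\phi_0^2 \gamma_0^2 (2 - \phi_0^2)^2}{64\pi(1 - \phi_0^2)^2} & \frac{\phi_0^3 (2 - \phi_0^2)}{128\pi(1 - \phi_0^2)^2} \\ \frac{\phi_0^3 (2 - \phi_0^2)}{128\pi(1 - \phi_0^2)^2} & \frac{\phi_0^4}{256\pi(1 - \phi_0^2)^2 \gamma_0^2} \end{pmatrix}$$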

And P2 is the 2x2 symmetric matrix multiplied by a factor 



given by: 



and its coefficients {Pim)i<i,m<2 are 



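where

$$\Psi_1^{\phi_0} = \frac{1}{\sqrt{2\pi}} \frac{1}{(\gamma_0^2 - \sigma_\varepsilon^2)^{3/2}} \left(\frac{\gamma_0^2(1 + \phi_0^2)}{(1 - \phi_0^2)} - \frac{3\phi_0^2 \gamma_0^4}{(1 - \phi_0^2)(\gamma_0^2 - \sigma_\varepsilon^2)}\right),$$

$$\Psi_1^{\sigma_0^2} = \frac{1}{\sqrt{2\pi}} \frac{1}{(\gamma_0^2 - \sigma_\varepsilon^2)^{3/2}} \left(\frac{\phi_0}{(1 - \phi_0^2)} - \frac{3\phi_0 \gamma_0^2}{2(1 - \phi_0^2)(\gamma_0^2 - \sigma_\varepsilon^2)}\right),$$

$$\Psi_2^{\phi_0} = \frac{1}{\sqrt{2\pi}} \frac{1}{(\gamma_0^2 - \sigma_\varepsilon^2)^{7/2}} \frac{\gamma_0^4 \phi_0^2}{(1 - \phi_0^2)},$$

$$\Psi_2^{\sigma_0^2} = \frac{1}{\sqrt{2\pi}} \frac{1}{(\gamma_0^2 - \sigma_\varepsilon^2)^{7/2}} \frac{\gamma_0^2 \phi_0}{2(1 - \phi_0^2)}.$$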



The covariance term are given by: 



(#^ - 40g + 1)^1 (j) + ^^"(;:^^") g,(j) + ^c~3(j) ^^^^|^5i(j) + ^^^52(j) + 453(j) 
' ^^^^||^5i(j) + ^^^^^^52(^) + 4£3(j) 45i(j)-4'^2(j) + ^53(j) 



with A ^ 2^7g(l-0g)^ ' 



52(j) 
53(j) 



5i(^-) = ^(2-0^T^/Vfv+-^ 



(2-0^^)2^ 



a2J 



70 

-(2-0^^")-'/V(^ + 5^2, 

3(2-^^^^5/2 
70 ^ 



3V^ + 5y,-(4V + 2) 



^^^^ 



(2-0^^)2 ^ (2~0^^")4 



where: 



(2-0^^)2-02, 2_ 



Moreover $\lim_{j \to \infty} \Omega_j(\theta_0) = 0_{\mathcal{M}_{2 \times 2}}$.

Remark. In practice, to compute the covariance matrices $\Omega_j(\theta)$ that appear in Corollary 1.2, we have
truncated the infinite sum ($q_{trunc} = 100$).
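
The truncation mentioned in the remark can be sketched as below, for $\Omega = \Omega_1 + 2\sum_{j>1}\Omega_j$ cut at $j = q_{trunc}$. The function name and the calling convention are hypothetical: `omega_1` is a square matrix given as nested lists and `omega_j` is a user-supplied function returning the matrix $\Omega_j$ for a lag $j \ge 2$; whether the original implementation includes the term $j = q_{trunc}$ itself is an assumption.

```python
def truncated_long_run_covariance(omega_1, omega_j, q_trunc=100):
    """Approximate Omega = Omega_1 + 2 * sum_{j=2}^{q_trunc} Omega_j."""
    total = [row[:] for row in omega_1]      # copy so the input is not modified
    size = len(total)
    for j in range(2, q_trunc + 1):
        term = omega_j(j)
        for r in range(size):
            for c in range(size):
                total[r][c] += 2.0 * term[r][c]
    return total


if __name__ == "__main__":
    # Toy 1 x 1 example with geometrically decaying covariances.
    print(truncated_long_run_covariance([[1.0]], lambda j: [[0.5 ** (j - 1)]]))
```

Because $\Omega_j(\theta_0)$ vanishes as $j$ grows, the neglected tail is small once the covariances have decayed; the toy example uses a geometric decay to mimic this.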
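Proof. Calculus of $\nabla_\theta m_\theta$:

For all $x \in \mathbb{R}$, the function $l_\theta(x)$ is twice differentiable w.r.t. $\theta$ on the compact subset $\Theta$. More precisely,
note that since $\gamma^2 = \sigma^2/(1 - \phi^2)$, it follows from the definition of the subset $\Theta$ that $(\gamma^2 - \sigma_\varepsilon^2) > 0$. So that for
all $\mathbf{z}_i$ in $\mathbb{R}^2$ the function $m_\theta(\mathbf{z}_i): \theta \in \Theta \mapsto m_\theta(\mathbf{z}_i)$ is differentiable and:

$$\nabla_\theta m_\theta(\mathbf{z}_i) = \left(\frac{\partial m_\theta(\mathbf{z}_i)}{\partial \phi}, \frac{\partial m_\theta(\mathbf{z}_i)}{\partial \sigma^2}\right)' = \left(\frac{\partial \|l_\theta\|_2^2}{\partial \phi} - 2 z_{i+1} u^*_{\partial l_\theta / \partial \phi}(z_i),\ \frac{\partial \|l_\theta\|_2^2}{\partial \sigma^2} - 2 z_{i+1} u^*_{\partial l_\theta / \partial \sigma^2}(z_i)\right)',$$

with:

$$\frac{\partial \|l_\theta\|_2^2}{\partial \phi} = \frac{\phi\gamma(2 - \phi^2)}{4\sqrt{\pi}(1 - \phi^2)}, \qquad \frac{\partial \|l_\theta\|_2^2}{\partial \sigma^2} = \frac{\phi^2}{8\sqrt{\pi}\,\gamma(1 - \phi^2)}.$$

And the functions $u^*_{\partial l_\theta / \partial \phi}(x)$ and $u^*_{\partial l_\theta / \partial \sigma^2}(x)$ are given in Eq. (27)-(28). Therefore,
$\nabla_\theta m_\theta(\mathbf{z}_i)$ is obtained by evaluating the expression above at the point $\theta_0$. (29)

Calculus of $P_1$:

Recall that we have:

$$P_1 = \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right] \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right]'.$$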

And the moments $(\mu_{2k})_{k \in \mathbb{N}}$ of a centered Gaussian random variable with variance $\sigma^2$ are given by:

$$\mu_{2k} = \frac{(2k)!}{2^k\, k!}\, \sigma^{2k}.$$



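We denote by $P(x)$ a polynomial function of arbitrary degree. We are interested in the calculus of $\mathbb{E}[P(X) g_{0,\gamma^2}(X)]$,
where $X \sim \mathcal{N}(0, \gamma^2)$. We have:

$$\mathbb{E}\left[P(X) g_{0,\gamma^2}(X)\right] = \int P(x) \frac{1}{\sqrt{2\pi\gamma^2}}\, e^{-\frac{x^2}{2\gamma^2}} \frac{1}{\sqrt{2\pi\gamma^2}}\, e^{-\frac{x^2}{2\gamma^2}}\, dx$$
$$= \frac{1}{2\pi\gamma^2} \int P(x)\, e^{-\frac{x^2}{\gamma^2}}\, dx$$
$$= \frac{1}{2\sqrt{\pi}\,\gamma}\, \mathbb{E}\left[P(A)\right] \quad \text{where } A \sim \mathcal{N}\left(0, \frac{\gamma^2}{2}\right).$$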
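
The two facts used here can be sketched in code: the even moments of a centered Gaussian variable, $\mu_{2k} = \frac{(2k)!}{2^k k!}\sigma^{2k}$, and the identity $\mathbb{E}[P(X) g_{0,\gamma^2}(X)] = \frac{1}{2\sqrt{\pi}\gamma}\mathbb{E}[P(A)]$ with $A \sim \mathcal{N}(0, \gamma^2/2)$. The function names and the dictionary representation of the polynomial (power mapped to coefficient) are hypothetical choices made for this illustration.

```python
import math


def gaussian_even_moment(k, variance):
    """E[X^(2k)] for X ~ N(0, variance): (2k)! / (2^k k!) * variance^k."""
    return math.factorial(2 * k) / (2 ** k * math.factorial(k)) * variance ** k


def expect_poly_times_density(coeffs, gamma2):
    """E[P(X) g_{0,gamma2}(X)] for X ~ N(0, gamma2), P given as {power: coefficient}."""
    half_variance = gamma2 / 2.0             # variance of the auxiliary variable A
    total = 0.0
    for power, coefficient in coeffs.items():
        if power % 2 == 1:
            continue                         # odd moments of a centered Gaussian vanish
        total += coefficient * gaussian_even_moment(power // 2, half_variance)
    return total / (2.0 * math.sqrt(math.pi * gamma2))


if __name__ == "__main__":
    # E[X^2 g(X)] for gamma^2 = 1 equals 1 / (4 sqrt(pi)).
    print(expect_poly_times_density({2: 1.0}, 1.0))
```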
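Denote by $B_1$ the constant $\frac{\phi_0^2}{4\pi\gamma_0^2}$.

We obtain:

$$P_1 = \begin{pmatrix} \mathbb{E}\left[b_{\phi_0}(X_1) \frac{\partial l_\theta}{\partial \phi}(X_1)\right]^2 & \mathbb{E}\left[b_{\phi_0}(X_1) \frac{\partial l_\theta}{\partial \phi}(X_1)\right] \mathbb{E}\left[b_{\phi_0}(X_1) \frac{\partial l_\theta}{\partial \sigma^2}(X_1)\right] \\ \mathbb{E}\left[b_{\phi_0}(X_1) \frac{\partial l_\theta}{\partial \phi}(X_1)\right] \mathbb{E}\left[b_{\phi_0}(X_1) \frac{\partial l_\theta}{\partial \sigma^2}(X_1)\right] & \mathbb{E}\left[b_{\phi_0}(X_1) \frac{\partial l_\theta}{\partial \sigma^2}(X_1)\right]^2 \end{pmatrix}$$

$$= B_1 \begin{pmatrix} \mathbb{E}[H_{11}(X)]^2 & \mathbb{E}[H_{11}(X)]\, \mathbb{E}[H_{12}(X)] \\ \mathbb{E}[H_{11}(X)]\, \mathbb{E}[H_{12}(X)] & \mathbb{E}[H_{12}(X)]^2 \end{pmatrix},$$

where $X \sim \mathcal{N}\left(0, \frac{\gamma_0^2}{2}\right)$. The polynomials $H_{11}$ and $H_{12}$ are given by:

$$H_{11}(x) = (a_1 x^2 + a_2 x^4),$$
$$H_{12}(x) = (b_1 x^2 + b_2 x^4).$$

Lastly, by replacing the terms $B_1$, $a_1$, $a_2$, $b_1$ and $b_2$ by their expressions given in Eq. (23)-(24) at the point $\theta_0$, we obtain

$$P_1 = \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right] \mathbb{E}\left[b_{\phi_0}(X_1)\left(\nabla_\theta l_\theta(X_1)\right)\right]' = \begin{pmatrix} \frac{\phi_0^2 \gamma_0^2 (2 - \phi_0^2)^2}{64\pi(1 - \phi_0^2)^2} & \frac{\phi_0^3 (2 - \phi_0^2)}{128\pi(1 - \phi_0^2)^2} \\ \frac{\phi_0^3 (2 - \phi_0^2)}{128\pi(1 - \phi_0^2)^2} & \frac{\phi_0^4}{256\pi(1 - \phi_0^2)^2 \gamma_0^2} \end{pmatrix}.$$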



Calculus of P2: 



E 



(Z2«^,,,(Zi)) (Z2«^^,^(Zi))' 



E 



^2 UW(^i) 



E 



^2 (""W(^i)) 



E 



E 



^2 "*a,, (^1) 



We have: 



E 



E 



+ 2vE'f°*|'°E 



■^2 2'? X g^^ (-,2_,2) 



^-^N E 



^2^1 X (Tg-^i) ' 



(30) 



The density of Zi is g(o,j)- Then, g(o.,j) x exp (^- ^^gl'^g) ] is equal to: 
1

1 



1 

v2^2 _ ^,4j,2\1/2 



cxp 



1 



2^ ((^?+7o')'-7oVg)'/' 



X exp 



2 (7g+'^g) 

2^^ V(7o'-'^l) (('^l+7o')^-7oV^) 



X exp 



1 



(7o' + <^e') 



2 H((^l+7g)2-7oV§) 



cxp 



2</'o7o 



27r 



exp 



■l(v'-^(z2-^07?zi)') 



((^l+7g)^-7oV§) 



-^t^ = + (r^??^) (1 - '^070^)) and V, 



Then, we obtain: 



9{o,j) X exp 



1/2t>1/2^Z2 



(^2)4S>.^(^l)- 



In the following, we note = (^^2_|_^2j"2_^4^2 Vi^^V2^'^ ■ Now, we can compute the moments: 



E 



ZfZ^exp 



^1 



(70^-^1) 



= V y z2g(o,v,)(2i)rf^iE [G^] where G ~ AA(</.o7o^i, V^2), 

= (*f°)Vyi(v2 + 3</)27o4yi). 



In a similar manner, we have: 



E 



exp 



(7o'-^l) 



= {^t)" ^ j 49{<iMzi)dzi¥.[G^] where G~AA((/.o7o'2i,V^2), 

= 15 (*^°) ' ^1/214' + 105 (*^«) ' F<l>htVt, 
= 15 ' TV^ (^2 + 7</.274Vi) . 



And:



= 2-ft«^pT J ztg^o.v,M)dziE [G^] where G ^ J^{<l>olo^i,V2) 



2* 



00 ,T,0O • 



By replacing all the terms of Eq. ( 30 1 we obtain: 



E 



^2 UX(^l) 



And: 



E 



And, 



^2 ( U*oie 

Oct'-' 



(31) 



mf^fTVl (V2 + 50^^o'^i) ■ (32) 



E 



2 1 ^Q'e 

00 



ZlZ\ X 5 



0,^ 



+ ^'^°*^ E 



zizX X 5 



-^-f E 



X 5 



(33) 



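Calculus of $\mathrm{Cov}(\nabla_\theta m_\theta(\mathbf{Z}_1), \nabla_\theta m_\theta(\mathbf{Z}_j))$: We want to compute:

$$\mathrm{Cov}\left(\nabla_\theta m_\theta(\mathbf{Z}_1), \nabla_\theta m_\theta(\mathbf{Z}_j)\right) = 4\,(C_j - P_1).$$

Since we have already computed the terms of the matrix $P_1$, it remains to compute the terms of the covariance
matrix $C_j$ given by:

$$C_j = \mathbb{E}\left[b_{\phi_0}(X_1) b_{\phi_0}(X_j) \left(\nabla_\theta l_\theta(X_1)\right)\left(\nabla_\theta l_\theta(X_j)\right)'\right].$$

For all $j > 1$, the pair $(X_1, X_j)$ has a multivariate normal density $g_{(0,\mathcal{W})}$ where $\mathcal{W}$ is given by:

$$\mathcal{W} = \gamma_0^2 \begin{pmatrix} 1 & \phi_0^j \\ \phi_0^j & 1 \end{pmatrix}.$$

The density of the couple $(X_1, X_j)$ is:

$$g_{(0,\mathcal{W})}(x_1, x_j) = \frac{1}{2\pi} \det(\mathcal{W})^{-1/2} \exp\left(-\frac{1}{2}(x_1, x_j)\, \mathcal{W}^{-1} (x_1, x_j)'\right).$$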



We start by computing: 



We have: 



9{o,w){xi,Xj) X exp (^-^ {^1 + '^j) j • 



9{o,w){xi,Xj) X exp (^-^{^1 

= ^ det (W)-^/' exp l^^—i^ (a;?(l - 4^) + x%\ - ^x\- 2ct>ix^xj + x^^fj , 
= ^ det {W)-'/' exp (^-—^^^ [(a;f (1 - 4^) +xl- 24x,x,) + {x]{l - 4^) + x]) 

(2 - 4^) (^xl - 2^^A-xix, j + - 4J) + x^) j , 



= ^ det (W)"'/' exp ( K--^ 

= ^det(W)-/^xp (- L - 



2N 



, 2(1-0^^)70^ ' V (2-'/'o'■)^ 
For all j > 1, we define: 



(2-^J')»-^? (2-#) 



We can rewrite: 



5(o,w)(a;i,a:j) X exp ^{xl + x"])^ 



_ Vl'^ 1 (^_^ 2\ 1 

" 70 ,/2^V.V2 1, 2y/^J ^ (2-0^,y/,^^,/2^"P(^ 2V 1^^^ (2-0^^) 
So, by Fubini's Theorem, we obtain: 



E 



where - A/" ( ,V ) . Thus, E[Xf] = V + ( ,f"%X ■ We obtain:



E 



1/2 



70 



(2-0^-') 



1/2 



exp 



2y, 



1/2 



J 

70 

3/2 



70 

ci(i), 



(2-</>o') 



2j^2 



(34) 



where Xj ^ A/'(0, V, ). Additionally, we have: 



E 



70 ^ "^"^ V 

C2{j). 



(35) 



Now, we are interested in E 



In a similar manner, we obtain: 



E 



X^X^ exp 



1/2 



70 



1 

1 



27rTA 



1/2 



exp 



2V, ^ 



(36) 



where Xi ^ N 



,(2-0^-'') 



, V . We use the fact that the moments of a random variable X ~ v) are: 



E [X"] = {n - l)wE [A:"-2j ^ 



e[a:i'] = 3VE[a:^] 



3V^ + (4V + 2)- 



(2-0o') 



¥.[Xt] 



(2-<A^^) 



By replacing E[X2'] in equation (361, we have: 
E



^ 3(2-0^^^5/2 

70 ^ 
For all J > 1, the matrix Cj is given by: 



SV^ + 5V,(4V + 2) 



35W 



C, = A 



aiciiJ) + 2aia2C2{j) + alcsij) 
aibici{j) + aib2 + 02^1 £2 (j) + a2b2h{j) 



aibici{j) + 0162 + a2biC2{j) + a2&2C30') 
blci{3)+2bib2~C2{j) + blc3{j) 



where the coefBcients ci(j), C2(j), and c^{j) are given by (34), (35) and (37l and: 

A = 



2^70^(1-02)2- 

Finally, by replacing the terms ai, 02, &i and 627 the matrix Cj is equal to: 



(37) 



C,^A 



(404 _ m + 1)5, (j) + 20g(i^-20g) g^(^.) ^ + Mi=p5,Q-) + ^53(^y 



27S 



4c~i(,) + 4J2(j) + §53(j) 



Asymptotic behaviour of the covariance matrix Q,j{6i^): By the stationary assumption |0o| < 1> the 
limits of the following terms are: 



And, 



Therefore, 



We obtain: 



lim V, ^ ^ and lim Vi = 



lim c,U) = 4^ 1™ ^2(j) = lim 53 (j) 



32 ■ 



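$$\lim_{j \to \infty} C_j = \begin{pmatrix} \frac{\phi_0^2 \gamma_0^2 (2 - \phi_0^2)^2}{64\pi(1 - \phi_0^2)^2} & \frac{\phi_0^3 (2 - \phi_0^2)}{128\pi(1 - \phi_0^2)^2} \\ \frac{\phi_0^3 (2 - \phi_0^2)}{128\pi(1 - \phi_0^2)^2} & \frac{\phi_0^4}{256\pi(1 - \phi_0^2)^2 \gamma_0^2} \end{pmatrix} = P_1,$$

$$\lim_{j \to \infty} \mathrm{Cov}\left(\nabla_\theta m_{\theta_0}(\mathbf{Z}_1), \nabla_\theta m_{\theta_0}(\mathbf{Z}_j)\right) = 4 \lim_{j \to \infty} (C_j - P_1) = 0_{\mathcal{M}_{2 \times 2}}.$$

We conclude that the covariance between the two vectors $\nabla_\theta m_{\theta_0}(\mathbf{Z}_1)$ and $\nabla_\theta m_{\theta_0}(\mathbf{Z}_j)$ vanishes when the lag
between the two observations $\mathbf{Z}_1$ and $\mathbf{Z}_j$ goes to infinity.

Calculus of $V_{\theta_0}$: The Hessian matrix $V_{\theta_0}$ is given by (26). □

C.2 The SV model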
C.2.1 Contrast function 

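The $L_2$-norm and the Fourier transform of the function $l_\theta$ are the same as in the Gaussian AR(1) model. The
only difference is the law of the measurement noise, which is a log-chi-square for the log-transform SV model.

Consider the random variable $\varepsilon = \beta \log(X^2) - \mathcal{E}$ where $\mathcal{E} = \beta\, \mathbb{E}[\log(X^2)]$, such that $\varepsilon$ is centered. The random
variable $X$ is a standard Gaussian random variable. The Fourier transform of $\varepsilon$ is given by:

$$\mathbb{E}\left[\exp(i\varepsilon y)\right] = \exp(-i\mathcal{E}y)\, \mathbb{E}\left[X^{2i\beta y}\right] = \exp(-i\mathcal{E}y) \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x^{2i\beta y} \exp\left(-\frac{x^2}{2}\right) dx.$$

By a change of variable $z = \frac{x^2}{2}$, one has

$$\mathbb{E}\left[\exp(i\varepsilon y)\right] = \exp(-i\mathcal{E}y) \frac{2^{i\beta y}}{\sqrt{\pi}} \int_0^{+\infty} z^{i\beta y - \frac{1}{2}} e^{-z}\, dz = \frac{\exp(-i\mathcal{E}y)}{\sqrt{\pi}}\, 2^{i\beta y}\, \Gamma\left(\frac{1}{2} + i\beta y\right),$$

and the expression (13) of the contrast function follows with

$$u_{l_\theta}(y) = \frac{1}{2\sqrt{\pi}} \left(\frac{-i\phi y \gamma^2 \exp\left(-\frac{y^2}{2}\gamma^2\right)}{\exp(-i\mathcal{E}y)\, 2^{i\beta y}\, \Gamma\left(\frac{1}{2} + i\beta y\right)}\right).$$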



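C.2.2 Checking assumptions of Theorem 1.1

Regularity conditions: The proof is essentially the same as for the Gaussian case since the functions $l_\theta(x)$ and
$\mathbf{P} m_\theta$ are the same. We need only to check the assumptions (C) and (T). These assumptions are satisfied since
Fan (see [Fan91]) showed that the noise has a Fourier transform $f_\varepsilon^*$ which satisfies:

$$|f_\varepsilon^*(x)| = \sqrt{2} \exp\left(-\frac{\pi}{2}|x|\right)\left(1 + O\left(\frac{1}{|x|}\right)\right), \quad |x| \to \infty,$$

which means that $f_\varepsilon$ is super-smooth in his terminology. Furthermore, by the compactness of the parameter
space $\Theta$ and as the functions $l_\theta^*$ and, for $j, k \in \{1, 2\}$, the functions $\left(\frac{\partial l_\theta}{\partial \theta_j}\right)^*$ and $\left(\frac{\partial^2 l_\theta}{\partial \theta_j \partial \theta_k}\right)^*$ have the following form:
$C_1(\theta) P(x) \exp(-C_2(\theta) x^2)$, where $C_1(\theta)$ and $C_2(\theta)$ are two constants well defined in the parameter space $\Theta$ with
$C_2(\theta) > 0$, we obtain:

$$\mathbb{E}\left[\left|Z_2 u^*_{\partial l_\theta / \partial \theta_j}(Z_1)\right|^{2+\delta}\right] < \infty \quad \text{for some } \delta > 0,$$

$$\mathbb{E}\left[\sup_{\theta \in \mathcal{U}} \left|Z_2 u^*_{\partial^2 l_\theta / \partial \theta_j \partial \theta_k}(Z_1)\right|\right] < \infty \quad \text{for some neighbourhood } \mathcal{U} \text{ of } \theta_0.$$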
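
The super-smooth decay of the noise can be illustrated numerically in the case $\beta = 1$. The sketch relies on the classical gamma-function identity $|\Gamma(\tfrac{1}{2} + ix)|^2 = \pi / \cosh(\pi x)$, which is not stated in the text and is brought in here as an outside fact; combined with the Fourier transform of the log-chi-square noise it gives $|f_\varepsilon^*(x)| = 1/\sqrt{\cosh(\pi x)}$. The function names are hypothetical.

```python
import math


def log_chi2_cf_modulus(x):
    """|f_eps^*(x)| for beta = 1, using |Gamma(1/2 + ix)|^2 = pi / cosh(pi x)."""
    return 1.0 / math.sqrt(math.cosh(math.pi * x))


def fan_leading_term(x):
    """Leading term sqrt(2) * exp(-pi |x| / 2) of the super-smooth tail."""
    return math.sqrt(2.0) * math.exp(-math.pi * abs(x) / 2.0)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 5.0):
        print(x, log_chi2_cf_modulus(x) / fan_leading_term(x))
```

The printed ratio approaches 1 quickly as $|x|$ grows, which is the exponential tail behaviour that makes the deconvolution problem super-smooth.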



C.2.3 Expression of the Covariance matrix

As the functions $l_\theta(x)$ and $\mathbf{P} m_\theta$ are the same for the two models, the expressions of the matrices $V_{\theta_0}$ and
$\Omega_j(\theta_0)$ are given in Lemma C.1. We need only to use an estimator of $P_2 = \mathbb{E}[Z_2^2 (u^*_{\nabla_\theta l_\theta}(Z_1))^2]$ since we can only
approximate $u^*_{\nabla_\theta l_\theta}(y)$. A natural and consistent estimator of $P_2$ is given by:

$$\hat{P}_2 = \frac{1}{n} \sum_{i=1}^{n-1} \left(Z_{i+1}^2 \left(u^*_{\nabla_\theta l_\theta}(Z_i)\right)^2\right). \qquad (38)$$

Remark. In some models, the covariance matrix $\Omega_j(\hat\theta_n)$ cannot be computed explicitly. We refer the reader
to [Hay00], chapter 6, Section 6.6, p. 408 for this case.



D EM algorithm 

We first refer to |DLR77| for general details on the EM algorithm. The EM algorithm is an iterative procedure 
for maximizing the log-likelihood l{9) = log{fg{Zi-n))- Suppose that after the fc*'' iteration, the estimate for 9 
is given by 9^. Since the objective is to maximize l{9), we want to compute an updated 9 such that: 

m > i{9k) 

Hidden variables can be introduced to make the ML estimation tractable. Denote the hidden random variables by $U_{1:n}$ and a given realization by $u_{1:n}$. The total probability $f_\theta(Z_{1:n})$ can be written as:

\[
f_\theta(Z_{1:n}) = \sum_{u_{1:n}} f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n}).
\]

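Hence,

\begin{align*}
l(\theta) - l(\theta_k)
&= \log(f_\theta(Z_{1:n})) - \log(f_{\theta_k}(Z_{1:n})), \\
&= \log\Big(\sum_{u_{1:n}} f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})\Big) - \log(f_{\theta_k}(Z_{1:n})), \\
&= \log\Big(\sum_{u_{1:n}} f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})\, \frac{f_{\theta_k}(u_{1:n} \mid Z_{1:n})}{f_{\theta_k}(u_{1:n} \mid Z_{1:n})}\Big) - \log(f_{\theta_k}(Z_{1:n})), \\
&= \log\Big(\sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n})\, \frac{f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})}{f_{\theta_k}(u_{1:n} \mid Z_{1:n})}\Big) - \log(f_{\theta_k}(Z_{1:n})), \tag{39} \\
&\ge \sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}) \log\Big(\frac{f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})}{f_{\theta_k}(u_{1:n} \mid Z_{1:n})}\Big) - \log(f_{\theta_k}(Z_{1:n})), \tag{40} \\
&= \sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}) \log\Big(\frac{f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})}{f_{\theta_k}(u_{1:n} \mid Z_{1:n})}\Big) - \log(f_{\theta_k}(Z_{1:n})) \sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}), \tag{41} \\
&= \sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}) \log\Big(\frac{f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})}{f_{\theta_k}(u_{1:n} \mid Z_{1:n})\, f_{\theta_k}(Z_{1:n})}\Big), \\
&= \Delta(\theta, \theta_k).
\end{align*}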



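In going from Eq.(39) to Eq.(40) we use the Jensen inequality: $\log \sum_{i=1}^{n} \lambda_i x_i \ge \sum_{i=1}^{n} \lambda_i \log(x_i)$ for constants $\lambda_i \ge 0$ with $\sum_{i=1}^{n} \lambda_i = 1$. In going from Eq.(40) to Eq.(41) we use the fact that $\sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}) = 1$.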

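Hence,

\[
l(\theta) \ge l(\theta_k) + \Delta(\theta, \theta_k) = \mathcal{L}(\theta, \theta_k) \quad \text{and} \quad \Delta(\theta, \theta_k) = 0 \ \text{for } \theta = \theta_k.
\]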
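The bound can be verified numerically on a small example. The sketch below assumes a hypothetical toy model that is not the SV model: each hidden label $u_i \in \{0,1\}$ is Bernoulli($\theta$), and $Z_i \mid u_i$ is Gaussian with unit variance and mean 0 or 2. The sum over $u_{1:n}$ is enumerated explicitly, so the code follows the derivation term by term.

```python
import itertools
import math

MEANS = (0.0, 2.0)  # hypothetical emission means for u = 0 and u = 1


def normal_pdf(x: float, mean: float) -> float:
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def joint(theta: float, z, u) -> float:
    """f_theta(Z_{1:n} | u_{1:n}) * f_theta(u_{1:n}) for the toy model."""
    p = 1.0
    for zi, ui in zip(z, u):
        p *= (theta if ui == 1 else 1.0 - theta) * normal_pdf(zi, MEANS[ui])
    return p


def log_lik(theta: float, z) -> float:
    """l(theta) = log of the sum of the joint density over all u_{1:n}."""
    labels = itertools.product((0, 1), repeat=len(z))
    return math.log(sum(joint(theta, z, u) for u in labels))


def lower_bound(theta: float, theta_k: float, z) -> float:
    """L(theta, theta_k) = l(theta_k) + Delta(theta, theta_k)."""
    marg_k = math.exp(log_lik(theta_k, z))
    delta = 0.0
    for u in itertools.product((0, 1), repeat=len(z)):
        post_k = joint(theta_k, z, u) / marg_k  # f_{theta_k}(u | Z)
        delta += post_k * math.log(joint(theta, z, u) / (post_k * marg_k))
    return log_lik(theta_k, z) + delta


z_obs = [-0.3, 1.7, 2.4]
for theta in (0.1, 0.3, 0.6, 0.9):
    print(theta, log_lik(theta, z_obs), lower_bound(theta, 0.3, z_obs))
```

With $\theta_k = 0.3$, the two printed columns coincide on the row $\theta = 0.3$ and the bound lies below the log-likelihood elsewhere.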

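The function $\mathcal{L}(\theta, \theta_k)$ is bounded above by the log-likelihood function $l(\theta)$, and the two are equal when $\theta = \theta_k$. Consequently, any $\theta$ which increases $\mathcal{L}(\theta, \theta_k)$ above its value at $\theta_k$ also increases $l(\theta)$. The EM algorithm selects $\theta$ such that $\mathcal{L}(\theta, \theta_k)$ is maximized. We denote this updated value by $\theta_{k+1}$. Thus,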



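\begin{align*}
\theta_{k+1}
&= \arg\max_\theta \Big\{ l(\theta_k) + \sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}) \log\Big(\frac{f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})}{f_{\theta_k}(u_{1:n} \mid Z_{1:n})\, f_{\theta_k}(Z_{1:n})}\Big) \Big\} \\
&= \arg\max_\theta \Big\{ \sum_{u_{1:n}} f_{\theta_k}(u_{1:n} \mid Z_{1:n}) \log\big(f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})\big) \Big\} \quad \text{if we drop the terms which do not depend on } \theta, \\
&= \arg\max_\theta \big\{ \mathbb{E}\big[\log\big(f_\theta(Z_{1:n} \mid u_{1:n})\, f_\theta(u_{1:n})\big)\big] \big\} \quad \text{where the expectation is taken with respect to } f_{\theta_k}(u_{1:n} \mid Z_{1:n}). \tag{42}
\end{align*}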
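The update (42) can be illustrated on a hypothetical toy model, unrelated to the SV model, in which the maximization has a closed form. The sketch assumes independent hidden labels $u_i \sim$ Bernoulli($\theta$) and $Z_i \mid u_i \sim \mathcal{N}(\mu_{u_i}, 1)$ with known means; only $f_\theta(u_{1:n})$ depends on $\theta$, so the maximizer of the expected complete log-likelihood is the average posterior probability of $u_i = 1$.

```python
import math

MEANS = (0.0, 2.0)  # hypothetical known emission means for u = 0 and u = 1


def normal_pdf(x: float, mean: float) -> float:
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def log_lik(theta: float, z) -> float:
    """Observed-data log-likelihood l(theta) of the toy mixture."""
    return sum(
        math.log((1.0 - theta) * normal_pdf(zi, MEANS[0])
                 + theta * normal_pdf(zi, MEANS[1]))
        for zi in z
    )


def em_step(theta_k: float, z) -> float:
    """One EM update theta_k -> theta_{k+1} following Eq. (42)."""
    # E-step: posterior probability of u_i = 1 under the current theta_k.
    resp = []
    for zi in z:
        w1 = theta_k * normal_pdf(zi, MEANS[1])
        w0 = (1.0 - theta_k) * normal_pdf(zi, MEANS[0])
        resp.append(w1 / (w0 + w1))
    # M-step: the expected complete log-likelihood is maximized in closed form.
    return sum(resp) / len(resp)


def run_em(theta0: float, z, n_iter: int = 20):
    """Return the sequence theta_0, theta_1, ..., theta_{n_iter}."""
    path = [theta0]
    for _ in range(n_iter):
        path.append(em_step(path[-1], z))
    return path


data = [-0.5, 0.1, 0.4, 1.6, 2.2, 2.5, 1.9]
for theta in run_em(0.5, data, n_iter=5):
    print(f"theta = {theta:.6f}   log-likelihood = {log_lik(theta, data):.6f}")
```

The printed log-likelihood should not decrease from one iterate to the next, which is the monotonicity property implied by the lower bound.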



D.1 Simulated Expectation Maximization Estimator

Here, we describe the SIEMLE proposed by Kim, Shephard and Chib [KSC98] for the SV model. These authors retain the linear log-transform model given in (12). However, instead of approximating the log-chi-square distribution of $\varepsilon_i$ with a Gaussian distribution, they approximate $\varepsilon_i$ by a mixture of seven Gaussians. The distribution of the noise is given by:



\[
f_{\varepsilon_i}(x) \approx \sum_{j=1}^{7} q_j\, g_{(m_j, v_j^2)}(x),
\]

where $g_{(m,v)}(x)$ denotes the Gaussian density with mean $m$ and variance $v$. Here $f_{\varepsilon_i \mid s_i = j}(x) = g_{(m_j, v_j^2)}(x)$ is the Gaussian distribution of $\varepsilon_i$ conditional on an indicator variable $s_i$ at time $i$, and the $q_j$, $j = 1, \dots, 7$, are the given weights attached to each component, with $\sum_{j=1}^{7} q_j = 1$. Note that, most importantly, given the indicator variable $s_i$ at each time $i$, the log-transform model is Gaussian. That is:

\[
\varepsilon_i \mid s_i = j \sim \mathcal{N}(m_j, v_j^2).
\]

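Then, conditionally on the indicator variables $s_i$, the SV model becomes a Gaussian state-space model and the Kalman filter can be used in the SIEMLE in order to compute the log-likelihood function given by:

\[
\log f_\theta(Z_{1:n} \mid s_{1:n}) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n} \log F_i - \frac{1}{2}\sum_{i=1}^{n} \frac{\nu_i^2}{F_i},
\]

with $\nu_i = (Z_i - \hat{X}_i^{-} - m_{s_i})$ and $F_i = \mathbb{V}_\theta[\nu_i] = P_i^{-} + v_{s_i}^2$. The quantities $\hat{X}_i^{-} = \mathbb{E}_\theta[X_i \mid Z_{1:i-1}]$ and $P_i^{-} = \mathbb{E}_\theta[(X_i - \hat{X}_i^{-})^2]$ are computed by the Kalman filter.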
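The prediction-error decomposition above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the hidden state is taken to follow a stationary AR(1) equation $X_{i+1} = \phi X_i + \eta_{i+1}$ with $\mathbb{V}[\eta_i] = \sigma^2$ and $|\phi| < 1$, the filter is initialized at the stationary law, and the indicators are 0-based indices into the lists of component means `m` and variances `v2` (all names are illustrative).

```python
import math
from typing import Sequence


def conditional_log_lik(
    z: Sequence[float],
    s: Sequence[int],
    phi: float,
    sigma2: float,
    m: Sequence[float],
    v2: Sequence[float],
) -> float:
    """log f_theta(Z_{1:n} | s_{1:n}) by the Kalman filter.

    Assumed model: X_{i+1} = phi * X_i + eta_{i+1}, Z_i = X_i + eps_i,
    with eps_i | s_i = j ~ N(m[j], v2[j]).
    """
    x_pred = 0.0                          # E[X_1] under stationarity
    p_pred = sigma2 / (1.0 - phi ** 2)    # Var[X_1] under stationarity
    loglik = 0.0
    for zi, si in zip(z, s):
        nu = zi - x_pred - m[si]          # innovation nu_i
        f = p_pred + v2[si]               # innovation variance F_i
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(f) + nu * nu / f)
        gain = p_pred / f                 # Kalman gain
        x_filt = x_pred + gain * nu       # filtered mean
        p_filt = p_pred * (1.0 - gain)    # filtered variance
        x_pred = phi * x_filt             # one-step-ahead prediction
        p_pred = phi * phi * p_filt + sigma2
    return loglik


# Toy call with a single mixture component (illustrative values only).
print(conditional_log_lik([0.5, -0.2], [0, 0], 0.9, 0.19, [0.0], [1.0]))
```

For two observations and one component, the result can be compared with the bivariate Gaussian log-density of $(Z_1, Z_2)$, which is a convenient sanity check on the recursion.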

Hence, if we consider that the missing data $u_{1:n}$ for the EM correspond to the indicator variables $s_{1:n}$, then according to Eq.(42), and since $f(s_{1:n})$ does not depend on $\theta$, the Maximization step is:

\[
\theta_{k+1} = \arg\max_\theta \big\{ \mathbb{E}[\log f_\theta(Z_{1:n} \mid s_{1:n})] \big\} = \arg\max_\theta Q(\theta, \theta_k),
\]

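where the expectation is taken with respect to $f_{\theta_k}(s_{1:n} \mid Z_{1:n})$. Nevertheless, for the SV model, the problem with the EM algorithm is that the density $f_{\theta_k}(s_{1:n} \mid Z_{1:n})$ is unknown. The main idea consists in introducing a Gibbs algorithm to obtain $M$ draws $s_{1:n}^{(1)}, \dots, s_{1:n}^{(M)}$ from the law $f_{\theta_k}(s_{1:n} \mid Z_{1:n})$. Hence, the objective function $Q(\theta, \theta_k)$ is approximated by the Monte Carlo average:

\[
Q(\theta, \theta_k) \approx \frac{1}{M}\sum_{l=1}^{M} \log f_\theta\big(Z_{1:n} \mid s_{1:n}^{(l)}\big).
\]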

Then, the simulated EM algorithm for the SV model is as follows: Let $C > 0$ be a threshold to stop the algorithm and $\theta_k$ a given arbitrary value of the parameter. While $|\theta_k - \theta_{k-1}| > C$,

1. Apply the Gibbs sampler as follows:

The Gibbs Sampler: Choose arbitrary starting values $x_{1:n}^{(0)}$, and let $l = 0$.

(a) Sample $s_{1:n}^{(l+1)} \sim f_{\theta_k}(s_{1:n} \mid Z_{1:n}, x_{1:n}^{(l)})$.

(b) Sample $x_{1:n}^{(l+1)} \sim f_{\theta_k}(X_{1:n} \mid Z_{1:n}, s_{1:n}^{(l+1)})$.

(c) Set $l = l + 1$ and go to (a).

2. $\theta_{k+1} = \arg\max_\theta Q(\theta, \theta_k)$.
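The outer loop of the procedure above can be sketched generically. In this sketch the two callables are hypothetical placeholders: `draw_indicators` stands for the Gibbs sampler of step 1 and `maximize_q` for the maximization of the Monte Carlo objective in step 2. The parameter is treated as a scalar for simplicity, and a cap on the number of iterations is added as a safeguard that the text does not mention.

```python
from typing import Any, Callable


def simulated_em(
    theta0: float,
    draw_indicators: Callable[[float], Any],
    maximize_q: Callable[[Any, float], float],
    tol: float = 1e-6,
    max_iter: int = 100,
) -> float:
    """Iterate steps 1 and 2 until |theta_{k+1} - theta_k| <= tol."""
    theta_prev = theta0
    for _ in range(max_iter):
        draws = draw_indicators(theta_prev)          # step 1: Gibbs draws under theta_k
        theta_next = maximize_q(draws, theta_prev)   # step 2: argmax of Q(., theta_k)
        if abs(theta_next - theta_prev) <= tol:
            return theta_next
        theta_prev = theta_next
    return theta_prev


# Toy stand-ins: the "maximizer" contracts towards the fixed point 2.0.
result = simulated_em(
    0.0,
    draw_indicators=lambda theta: None,
    maximize_q=lambda draws, theta: 0.5 * theta + 1.0,
)
print(result)
```

Because the draws are random, the successive iterates of a real simulated EM run fluctuate around the limit; in practice the threshold and the number of draws $M$ have to be chosen with that Monte Carlo noise in mind.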

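Step (a): to sample the vector $s_{1:n}$ from its full conditional density, we sample each $s_r$ independently. We have:

\[
f_{\theta_k}(s_{1:n} \mid Z_{1:n}, x_{1:n}) = \prod_{r=1}^{n} f_{\theta_k}(s_r \mid Z_r, x_r) \propto \prod_{r=1}^{n} f_{\theta_k}(Z_r \mid s_r, x_r)\, f(s_r),
\]

and $f_{\theta_k}(Z_r \mid s_r = j, x_r) \propto g_{(x_r + m_j, v_j^2)}(Z_r)$ for $j = 1, \dots, 7$. Step (b) of the Gibbs sampler is conducted by the Kalman filter since the model is Gaussian.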
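Step (a) amounts to drawing each indicator from a discrete distribution whose weights are $q_j\, g_{(x_r + m_j, v_j^2)}(Z_r)$, normalized over $j$. The sketch below assumes generic lists `q`, `m` and `v2` for the mixture weights, means and variances; the published seven-component constants of [KSC98] are not reproduced here, and the two-component values in the usage lines are purely illustrative.

```python
import math
import random
from typing import List, Sequence


def indicator_posterior(
    z_r: float, x_r: float,
    q: Sequence[float], m: Sequence[float], v2: Sequence[float],
) -> List[float]:
    """P(s_r = j | Z_r, x_r), proportional to q_j * N(Z_r; x_r + m_j, v2_j)."""
    weights = [
        qj * math.exp(-0.5 * (z_r - x_r - mj) ** 2 / vj) / math.sqrt(2.0 * math.pi * vj)
        for qj, mj, vj in zip(q, m, v2)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def sample_indicators(
    z: Sequence[float], x: Sequence[float],
    q: Sequence[float], m: Sequence[float], v2: Sequence[float],
    rng: random.Random,
) -> List[int]:
    """Draw s_1, ..., s_n independently from their full conditionals (0-based)."""
    components = range(len(q))
    return [
        rng.choices(components, weights=indicator_posterior(zr, xr, q, m, v2))[0]
        for zr, xr in zip(z, x)
    ]


# Illustrative two-component mixture (not the values used in [KSC98]).
rng = random.Random(0)
print(indicator_posterior(1.0, 0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]))
print(sample_indicators([1.0, -2.0, 0.3], [0.0, -1.0, 0.2],
                        [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0], rng))
```

Sampling each $s_r$ separately is valid here because, given $x_{1:n}$, the observations are conditionally independent across time.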